Research Method Dec5 Ed.

Abstract

This study employs a mathematical modeling approach.

General

In the mathematical modeling process, we draw on the queueing-theory research of adan_queueing2002, focusing on three quantities: the distribution of student arrivals between the third and fourth class sessions, the distribution of service times at the service windows, and the observed number of windows. The observation period is 10 days (2 weeks).

Modeling

1. Data Analysis

Arrival data analysis

Plot the collected data (the time at which each person enters the cafeteria) as a histogram.
Fit common distributions (Normal, Poisson, Exponential, Lognormal, Gamma, etc.) to determine the distributional form of the arrival and service processes.

The collected data were entered into a computer, and summary statistics and visual representations of the different variables were produced with statistical software. The frequency distribution of the survey results is shown in Figure XX.
We reviewed the observed data and removed certain outliers. Alternatively:

Enhanced outlier handling methods

Directly deleting outliers may result in a loss of information. Instead, it is recommended to:

Use statistical methods (e.g., the interquartile range (IQR) rule used in box plots) to identify outliers and analyze their causes.
If the outliers have special significance (e.g., congestion due to a sudden event), model or record them separately.
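
The IQR rule mentioned above can be sketched as follows. This is a minimal illustration, not the study's actual procedure: the function name `flag_outliers_iqr`, the conventional multiplier `k = 1.5`, and the sample service times are all assumptions made for the example.

```python
import numpy as np

def flag_outliers_iqr(values, k=1.5):
    """Return a boolean mask marking values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (values < lower) | (values > upper)

# Hypothetical service times in seconds; the last value mimics a sudden congestion event.
service_times = np.array([12.0, 15.0, 14.0, 13.0, 16.0, 15.0, 14.0, 90.0])
mask = flag_outliers_iqr(service_times)

outliers = service_times[mask]   # kept for separate analysis rather than deleted
cleaned = service_times[~mask]   # used for distribution fitting
print("outliers:", outliers)
print("cleaned:", cleaned)
```

Keeping the flagged values in a separate array, rather than discarding them, makes it possible to examine their causes or model them separately.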

General

Main
1. Input data:
- Enter the data (each value represents the time at which a student arrives at the cafeteria).
2. Distribution fitting:
- Use the distributions (normal, Poisson, exponential, lognormal, gamma) from the scipy.stats module.
- Fit each distribution and estimate its parameters.
3. Model selection criteria:
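- AIC (Akaike Information Criterion): balances goodness of fit against the number of parameters, $AIC = 2 k - 2 \ln \hat{L}$.
- BIC (Bayesian Information Criterion): similar to AIC, but the penalty on the number of parameters grows with the number of data points, $BIC = k \ln n - 2 \ln \hat{L}$.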
4. Output results:
- Name of the fitted distribution, parameters, log-likelihood values, AIC and BIC.
- The preferred distribution is the one with the smallest AIC/BIC value.

Extra

Addition of distributional tests:
- Kolmogorov-Smirnov (K-S) test: tests whether the data are consistent with the fitted distribution.
- Anderson-Darling test: gives more weight to the tails of the distribution than the K-S test.
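
Both tests are available in scipy.stats. The sketch below runs them on synthetic exponential data standing in for the observed interarrival times; the sample, the seed, and the choice of the exponential distribution are assumptions for illustration only. Note that when the parameters are estimated from the same data, the K-S p-value tends to be optimistic, so it should be read as a rough indication.

```python
import numpy as np
import scipy.stats as stats

rng = np.random.default_rng(0)
# Synthetic interarrival times in seconds (placeholder for the observed data).
sample = rng.exponential(scale=2.0, size=200)

# Fit an exponential distribution with the location fixed at 0.
loc, scale = stats.expon.fit(sample, floc=0)

# K-S test of the sample against the fitted exponential distribution.
ks_stat, ks_p = stats.kstest(sample, "expon", args=(loc, scale))
print(f"K-S statistic = {ks_stat:.3f}, p-value = {ks_p:.3f}")

# Anderson-Darling test: compare the statistic with the tabulated critical values.
ad = stats.anderson(sample, dist="expon")
print(f"A-D statistic = {ad.statistic:.3f}")
for crit, sig in zip(ad.critical_values, ad.significance_level):
    print(f"  reject at the {sig}% level: {ad.statistic > crit}")
```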
Multi-distribution combination test:
- If no single distribution fits well, try a mixture model (e.g., a Gaussian mixture) to capture more complex patterns.
  The probability density function (PDF) of the Gaussian Mixture Model is:

p (x) = \sum_{k = 1}^{K} π_{k} N (x ∣ μ_{k}, σ_{k}^{2})

Where:

$K$ is the number of Gaussian distributions (i.e., the number of mixture components).
$π_{k}$ is the weight of the $k$-th Gaussian distribution, satisfying $\sum_{k = 1}^{K} π_{k} = 1$.
$N (x ∣ μ_{k}, σ_{k}^{2})$ represents the $k$-th Gaussian distribution, with mean $μ_{k}$ and variance $σ_{k}^{2}$.
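
To make the mixture density concrete, the sketch below evaluates $p(x)$ for a two-component mixture and checks numerically that it integrates to about 1. The function name `gmm_pdf` and all parameter values are hypothetical (for instance, an early rush right after class plus a later, more spread-out group); they are not estimates from the study's data.

```python
import numpy as np
from scipy.stats import norm

def gmm_pdf(x, weights, means, sigmas):
    """Evaluate p(x) = sum_k pi_k * N(x | mu_k, sigma_k^2)."""
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if not np.isclose(weights.sum(), 1.0):
        raise ValueError("Mixture weights must sum to 1.")
    # One column per component (via broadcasting), then a weighted sum over components.
    components = norm.pdf(x[..., None], loc=means, scale=sigmas)
    return components @ weights

# Hypothetical parameters: arrival times in seconds after the end of class.
weights, means, sigmas = [0.7, 0.3], [60.0, 300.0], [20.0, 60.0]

grid = np.linspace(-200.0, 800.0, 20001)
density = gmm_pdf(grid, weights, means, sigmas)
area = np.sum(density) * (grid[1] - grid[0])  # Riemann-sum approximation of the integral
print(f"integral of p(x) over the grid ~ {area:.4f}")
```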

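import numpy as np
import pandas as pd
import scipy.stats as stats

arrival_times = np.array(
    list(map(float, input("Please enter the values (separated by spaces): ").split()))
)
n = len(arrival_times)

# Continuous candidates; this part can be modified after the data is collected
distributions = {
    "Normal": stats.norm,
    "Exponential": stats.expon,
    "Lognormal": stats.lognorm,
    "Gamma": stats.gamma,
}

results = []


def add_result(name, params, log_likelihood):
    """Record a fit together with its AIC and BIC."""
    k = len(params)  # number of estimated parameters
    results.append({
        "Distribution": name,
        "Parameters": params,
        "Log-Likelihood": log_likelihood,
        "AIC": 2 * k - 2 * log_likelihood,
        "BIC": k * np.log(n) - 2 * log_likelihood,
    })


for name, distribution in distributions.items():
    try:
        # Maximum-likelihood fit; logpdf avoids the underflow of log(pdf)
        params = distribution.fit(arrival_times)
        add_result(name, params, np.sum(distribution.logpdf(arrival_times, *params)))
    except Exception as e:
        results.append({"Distribution": name, "Error": str(e)})

# Poisson is discrete and scipy.stats.poisson has no fit() method.
# Its maximum-likelihood rate is the sample mean, and it only applies to count data.
if np.all(arrival_times >= 0) and np.all(arrival_times == np.round(arrival_times)):
    mu_hat = arrival_times.mean()
    add_result("Poisson", (mu_hat,), np.sum(stats.poisson.logpmf(arrival_times, mu_hat)))
else:
    results.append({"Distribution": "Poisson",
                    "Error": "Poisson requires non-negative integer counts"})

# Collect the results in a DataFrame and display them
results_df = pd.DataFrame(results)
print(results_df.to_string(index=False))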

2. Model Establishment

Based on assumptions about the data and on everyday observation, we assume that the arrival process follows some general distribution (e.g., a skewed distribution): most students enter the cafeteria right after classes end, and the arrival rate decreases steadily over time. We assume that service times are exponentially distributed, i.e., each service time is memoryless and does not depend on any previous service.

We therefore use the $G / M / c$ model. (Note: the actual model selection needs to be based on actual data)

In the $G / M / c$ model, the arrival process follows a general distribution, service times are exponentially distributed (Markovian), and there are $c$ parallel service windows.

Basic Parameters

Symbol	Explanation	Unit
$λ$	Average arrival rate of customers	people/s
$μ$	Average service rate of a single service window	services/s
$c$	Number of service windows	number
$ρ$	System utilization rate	dimensionless
$C_{a}$	Coefficient of variation of the arrival time distribution	dimensionless
$W_{q}$	Mean waiting time in the queue	s
$L_{q}$	Average number of people in the queue	number
$P_{0}$	Probability that the system is empty	dimensionless
where:

ρ = \frac{λ}{c μ}

where $ρ < 1$ is necessary for system stability.

$C_{a}$ is the coefficient of variation (ratio of standard deviation to mean) of the arrival time distribution. Since the mean time between arrivals is $1 / λ$: $C_{a} = \frac{σ_{a}}{1 / λ} = σ_{a} λ$

Standard deviation of arrival times $σ_{a}$

Given a set of observed data $[T_{a 1}, T_{a 2}, \dots, T_{an}]$ (the times between consecutive arrivals, whose mean is $1 / λ$), we can estimate $σ_{a}$ by the sample standard deviation:

σ_{a} = \sqrt{\frac{1}{n - 1} \sum_{i = 1}^{n} (T_{ai} - \bar{T}_{a})^{2}}

where $\bar{T}_{a} = \frac{1}{n} \sum_{i = 1}^{n} T_{ai}$ is the average of these times.
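
As a rough sketch of how $λ$, $σ_{a}$ and $C_{a}$ could be estimated from raw arrival timestamps, the code below takes the $T_{ai}$ to be the gaps between consecutive arrivals, so that their mean is $1 / λ$. That reading, the function name `arrival_statistics`, and the sample timestamps are assumptions made for the example.

```python
import numpy as np

def arrival_statistics(arrival_timestamps):
    """Estimate (lambda, sigma_a, C_a) from arrival timestamps in seconds."""
    t = np.sort(np.asarray(arrival_timestamps, dtype=float))
    gaps = np.diff(t)            # times between consecutive arrivals
    mean_gap = gaps.mean()       # estimate of 1 / lambda
    lam = 1.0 / mean_gap
    sigma_a = gaps.std(ddof=1)   # sample standard deviation (n - 1 denominator)
    c_a = sigma_a / mean_gap     # equivalently sigma_a * lam
    return lam, sigma_a, c_a

# Hypothetical timestamps: the gaps are 2, 1, 4, 1, 2 seconds.
timestamps = [0.0, 2.0, 3.0, 7.0, 8.0, 10.0]
lam, sigma_a, c_a = arrival_statistics(timestamps)
print(f"lambda = {lam:.3f} /s, sigma_a = {sigma_a:.3f} s, C_a = {c_a:.3f}")
```

For a Poisson arrival process the gaps are exponential and $C_{a}$ is close to 1; values well below or above 1 indicate more regular or more bursty arrivals, respectively.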

Mean Waiting Time $W_{q}$

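Using an approximation analogous to the Pollaczek-Khinchin (P-K) formula (the Allen-Cunneen approximation with exponential service, i.e., a service-time coefficient of variation equal to 1), the average waiting time in the queue $W_{q}$ can be estimated as:

W_{q} \approx \frac{C_{a}^{2} + 1}{2} \cdot W_{q, M / M / c}

where $W_{q, M / M / c}$ is the average waiting time in the queue of the $M / M / c$ model, given by: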

W_{q, M / M / c} = \frac{L _{q, M / M / c}}{λ}

Average number of people in the queue $L_{q}$

$L_{q, M / M / c}$ (the average number of people in the queue of the $M / M / c$ model) follows from the Erlang-C formula, with $P_{0}$ the probability that the system is empty:

L_{q, M / M / c} = P_{0} \cdot \frac{( λ / μ )^{c}}{c !} \cdot \frac{ρ}{( 1 - ρ )^{2}}

The corrected $L_{q}$ is then approximated by:

L_{q} \approx \frac{C _{a}^{2} + 1}{2} \cdot L_{q, M / M / c}

Probability that the system is empty $P_{0}$

$P_{0}$ : the probability that the system is empty, which can be calculated by the following formula: $P_{0} = \left( \sum_{k = 0}^{c - 1} \frac{( λ / μ )^{k}}{k !} + \frac{( λ / μ )^{c}}{c !} \cdot \frac{1}{1 - ρ} \right)^{- 1}$
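
As a quick check of the $P_{0}$ formula, the worked example below first reduces it to the single-window case $c = 1$, where it should give the familiar $1 - ρ$, and then evaluates it for illustrative values $λ = 4$, $μ = 5$, $c = 3$ (assumed numbers, not observed data).

```latex
% Special case c = 1, where ρ = λ / μ:
P_{0} = \left( 1 + \frac{ρ}{1 - ρ} \right)^{-1}
      = \left( \frac{1}{1 - ρ} \right)^{-1}
      = 1 - ρ

% Numerical example: λ = 4, μ = 5, c = 3, so λ / μ = 0.8 and ρ = 0.8 / 3 \approx 0.2667
\sum_{k = 0}^{2} \frac{0.8^{k}}{k !} = 1 + 0.8 + 0.32 = 2.12

\frac{0.8^{3}}{3 !} \cdot \frac{1}{1 - ρ} = \frac{0.08533}{0.73333} \approx 0.11636

P_{0} \approx \frac{1}{2.12 + 0.11636} \approx 0.4472
```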

Average system wait time $W$

The average wait time in the system $W$ includes the wait time in the queue $W_{q}$ and the service time:

W = W_{q} + \frac{1}{μ}

Average number of people in the system $L$

The average number of people in the system $L$ includes the number of people in the queue $L_{q}$ and the number of people being served:

L = L_{q} + \frac{λ}{μ}

Extra

Dynamic modeling: student arrival rates and service times are currently assumed to be fixed distributions, but may vary in real scenarios (e.g., peak hour fluctuations). It is possible to:
- Introduce time-varying parameter models (e.g., time segmentation, smoothing functions).
- Use Markov chain models to simulate dynamic service window states.
Explore multi-model comparisons:
- In addition to $G / M / c$ , try $G / G / c$ or more complex models (e.g., queuing networks) to analyze the overall service system.
- Compare model performance and choose the model that best fits the scenario.
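
One simple way to approach the time-segmentation idea is to estimate a separate, piecewise-constant arrival rate for each time segment. The sketch below is only an illustration: the function name `segment_arrival_rates`, the 5-minute segment length, and the timestamps are assumptions, and applying the steady-state formulas within each segment is itself an approximation.

```python
import numpy as np

def segment_arrival_rates(arrival_timestamps, segment_length, horizon):
    """Estimate the arrival rate (arrivals per second) in each time segment."""
    n_segments = int(np.ceil(horizon / segment_length))
    edges = segment_length * np.arange(n_segments + 1)
    counts, _ = np.histogram(arrival_timestamps, bins=edges)
    return edges, counts / segment_length

# Hypothetical arrival timestamps in seconds after the end of class.
timestamps = [5, 20, 40, 70, 110, 130, 350, 420, 580]
edges, rates = segment_arrival_rates(timestamps, segment_length=300, horizon=600)

for start, end, rate in zip(edges[:-1], edges[1:], rates):
    print(f"[{start:.0f} s, {end:.0f} s): lambda ~ {rate:.3f} arrivals/s")
```

Each segment's rate can then be used as $λ$ in the formulas above, provided $ρ < 1$ still holds within that segment.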

3. Model Application

import math


def calculate_queue_metrics(mu, lambd, c, sigma_a=None):
    """Approximate G/M/c metrics by scaling the M/M/c results with (C_a^2 + 1) / 2.

    mu: service rate of one window, lambd: arrival rate, c: number of windows,
    sigma_a: standard deviation of the time between arrivals (optional).
    """
    # Step 1: Calculate rho and check stability
    rho = lambd / (c * mu)
    if rho >= 1:
        raise ValueError("The system is unstable (rho >= 1). Please adjust input parameters.")

    # Step 2: Calculate C_a = sigma_a / (1 / lambd)
    # Defaults to 1 (Poisson arrivals) if sigma_a is not provided
    C_a = sigma_a * lambd if sigma_a is not None else 1.0

    # Step 3: Calculate P0
    a = lambd / mu  # offered load
    sum_k = sum(a**k / math.factorial(k) for k in range(c))
    tail = a**c / math.factorial(c)
    P0 = 1 / (sum_k + tail / (1 - rho))

    # Step 4: Calculate L_q (classical M/M/c, Erlang-C)
    L_q_MM_c = P0 * tail * rho / (1 - rho) ** 2

    # Step 5: Adjust L_q for the generalized case
    L_q = (C_a**2 + 1) / 2 * L_q_MM_c

    # Step 6: Calculate W_q (Little's law)
    W_q = L_q / lambd

    # Step 7: Calculate W
    W = W_q + 1 / mu

    # Step 8: Calculate L
    L = L_q + lambd / mu

    return {"rho": rho, "P0": P0, "L_q": L_q, "W_q": W_q, "L": L, "W": W}


# Example usage
mu = 5  # Service rate
lambd = 4  # Arrival rate
c = 3  # Number of servers
sigma_a = 0.1  # Optional, variability of arrival time

result = calculate_queue_metrics(mu, lambd, c, sigma_a)
print(result)

FootNote

We use $C_{a}$ (the coefficient of variation of arrival times) to correct the $M / M / c$ formulas, which yields an approximate result for the $G / M / c$ model.

Outputs

We enter the observed data into the code segment and conclude that

The average arrival time is…
The average number of people in the queue is…

Extra

Enhanced parameter estimation methods: Currently, parameter estimation is mainly based on best fit, but a more rigorous calculation of confidence intervals should be incorporated.
- Calculate confidence intervals for parameter estimates using the Bootstrapping method.
- Check the stability of the estimates by estimating them separately for different time periods and
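
A percentile bootstrap for the arrival rate could look like the sketch below. It resamples the observed gaps between arrivals with replacement and re-estimates $λ$ each time; the function name `bootstrap_ci`, the number of resamples, and the synthetic data are assumptions for illustration, not the study's actual procedure.

```python
import numpy as np

def bootstrap_ci(data, estimator, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for estimator(data)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    estimates = np.empty(n_resamples)
    for b in range(n_resamples):
        # Resample the data with replacement and re-estimate the parameter.
        resample = rng.choice(data, size=data.size, replace=True)
        estimates[b] = estimator(resample)
    lower, upper = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
    return lower, upper

# Synthetic times between arrivals in seconds (placeholder for the observed data).
rng = np.random.default_rng(1)
gaps = rng.exponential(scale=2.0, size=150)

rate_hat = 1.0 / gaps.mean()  # point estimate of lambda
low, high = bootstrap_ci(gaps, lambda g: 1.0 / g.mean())
print(f"lambda = {rate_hat:.3f} /s, 95% CI ~ [{low:.3f}, {high:.3f}]")
```

The same function can be reused for other parameters (for example $σ_{a}$ or $C_{a}$) by passing a different estimator.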
