MAR 25

Research Question

Based on established theory for campus cafeteria operations, how can I solve crowding problems using the equipment and technology available at a high school?

Approach

(Quantitative) Mainly using queueing theory to solve the crowding problem in our school’s cafeteria.

Design

Data collection
Data correction
Data analysis

Proposed Method

Research Method Dec5 Ed.
Research method

Abstract

This study employs a mathematical modeling approach.

General

In the mathematical modeling process, we draw on the research of adan_queueing2002 on queueing theory, focusing on three quantities: the distribution of student arrivals between the third and fourth class sessions, the service time distribution at the service windows, and the observed number of windows. Observation is planned for 10 days (2 weeks).

Modeling

1. Data Analysis

Arrival data analysis

Plot the collected data (the time at which each person enters the cafeteria) as a histogram.
Attempt to fit common distributions (Normal, Poisson, Exponential, Lognormal, Gamma, etc.) to determine the specific distributional form of the arrival and service processes.

The collected data were entered into a computer and analyzed for summary statistics and visual representations across different variables using a statistical system. The frequency distribution of the survey results is shown in Figure XX.
We reviewed the observed data and removed certain outliers. or:

Enhanced outlier handling methods

Deleting outliers directly may lose information. Instead, the following is recommended:

Use statistical methods (e.g., the interquartile range that underlies box plots) to identify outliers and analyze their causes.

If the outliers have special significance (e.g., congestion due to a sudden event), they can be modeled or recorded separately.
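
The interquartile-range rule mentioned above can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function name `flag_outliers_iqr`, the multiplier `k=1.5` (the usual box-plot convention), and the sample service times are all assumptions. Flagged values are kept in a separate list so they can be reviewed rather than silently dropped.

```python
import numpy as np

def flag_outliers_iqr(values, k=1.5):
    """Return a boolean mask that is True where a value lies outside the IQR fences."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (values < lower) | (values > upper)

# Hypothetical service times in seconds; 95.0 stands in for an unusual event.
service_times = [12.0, 15.0, 14.0, 13.0, 16.0, 15.0, 14.0, 95.0]
mask = flag_outliers_iqr(service_times)

# Keep the flagged values for separate review instead of deleting them.
flagged = [t for t, m in zip(service_times, mask) if m]
kept = [t for t, m in zip(service_times, mask) if not m]
print("flagged for review:", flagged)
print("kept:", kept)
```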

General

Main
1. Input data:
- Enter the data (each value is the time at which a student arrives at the cafeteria).
2. Distribution fitting:
- Use the candidate distributions (normal, Poisson, exponential, lognormal, gamma) from the scipy.stats module. Note that the Poisson distribution is discrete: it describes counts per interval rather than timestamps, and scipy.stats.poisson has no fit() method.
- Fit each distribution and estimate its parameters.
3. Model selection criteria:
- AIC (Akaike Information Criterion): balances goodness of fit against the number of parameters $k$, $AIC = 2 k - 2 \ln L$.
- BIC (Bayesian Information Criterion): like AIC, but the penalty per parameter grows with the number of data points $n$, $BIC = k \ln n - 2 \ln L$.
4. Output results:
- Name of the fitted distribution, parameters, log-likelihood values, AIC and BIC.
- The distribution with the smallest AIC/BIC value is selected.

Extra

Addition of distributional tests:

Kolmogorov-Smirnov (K-S) test: checks that the data fits the fitted distribution.

Anderson-Darling test: more sensitive to fitting the tails of the distribution.
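
As a rough illustration of the K-S check, the sketch below fits an exponential distribution and tests the sample against it with `scipy.stats.kstest`. The sample is synthetic (drawn from an exponential law with an assumed mean of 3 seconds) and stands in for real inter-arrival data. One caveat: when the parameters are estimated from the same sample that is tested, the reported p-value tends to be optimistic, so it is better read as a screening tool than as a strict test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical inter-arrival times in seconds (synthetic, for illustration only).
sample = rng.exponential(scale=3.0, size=500)

# Fit an exponential distribution with the location fixed at 0,
# so that only the scale (the mean inter-arrival time) is estimated.
loc, scale = stats.expon.fit(sample, floc=0)

# One-sample K-S test of the data against the fitted distribution.
ks_stat, ks_p = stats.kstest(sample, "expon", args=(loc, scale))

print(f"fitted mean inter-arrival time: {scale:.3f} s")
print(f"K-S statistic = {ks_stat:.4f}, p-value = {ks_p:.4f}")
```

A small statistic and a large p-value mean the test found no evidence against the fitted distribution; they do not prove that the distribution is correct.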

Mixture models:

If no single distribution fits well, try a mixture model (e.g., a Gaussian mixture) to capture more complex patterns.
The probability density function (PDF) of the Gaussian Mixture Model is:

$p (x) = \sum_{k = 1}^{K} π_{k} N (x ∣ μ_{k}, σ_{k}^{2})$
Where:

$K$ is the number of Gaussian distributions (i.e., the number of mixture components).

$π_{k}$ is the weight of the $k$ -th Gaussian distribution, satisfying $\sum_{k = 1}^{K} π_{k} = 1$ .

$N (x ∣ μ_{k}, σ_{k}^{2})$ represents the $k$ -th Gaussian distribution, with mean $μ_{k}$ and variance $σ_{k}^{2}$ .
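
To make the mixture density concrete, here is a small sketch that evaluates $p(x)$ directly from the definition and checks numerically that it integrates to about 1. The function name `gmm_pdf` and the two-component parameters (two arrival waves, in minutes after the bell) are hypothetical and not fitted to any data.

```python
import numpy as np

def gmm_pdf(x, weights, means, variances):
    """Evaluate p(x) = sum_k pi_k * N(x | mu_k, sigma_k^2) for scalar or array x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))[:, None]
    w = np.asarray(weights, dtype=float)
    mu = np.asarray(means, dtype=float)
    var = np.asarray(variances, dtype=float)
    if not np.isclose(w.sum(), 1.0):
        raise ValueError("mixture weights must sum to 1")
    # One column per Gaussian component, then the weighted sum over components.
    components = np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)
    return components @ w

# Hypothetical parameters: 70% of students in a first wave, 30% in a second.
weights, means, variances = [0.7, 0.3], [5.0, 20.0], [4.0, 9.0]

grid = np.linspace(-20.0, 60.0, 8001)
density = gmm_pdf(grid, weights, means, variances)

# Riemann-sum check that the density integrates to approximately 1.
area = float(np.sum(density) * (grid[1] - grid[0]))
print(f"area under the mixture density: {area:.4f}")
```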
import numpy as np
import pandas as pd
import scipy.stats as stats

# Read the arrival data as space-separated numbers
arrival_times = list(map(float, input("Please enter the values (separated by spaces): ").split()))

# Candidate continuous distributions; this part can be modified after the data is collected.
# Poisson is left out: it is a discrete distribution for counts per interval,
# and scipy.stats.poisson has no fit() method.
distributions = {
    "Normal": stats.norm,
    "Exponential": stats.expon,
    "Lognormal": stats.lognorm,
    "Gamma": stats.gamma,
}

results = []
n = len(arrival_times)  # number of data points

for name, distribution in distributions.items():
    try:
        # Maximum-likelihood estimates of the distribution parameters
        params = distribution.fit(arrival_times)

        # Log-likelihood of the data under the fitted distribution
        log_likelihood = np.sum(distribution.logpdf(arrival_times, *params))

        # AIC and BIC; k is the number of estimated parameters
        k = len(params)
        aic = 2 * k - 2 * log_likelihood
        bic = k * np.log(n) - 2 * log_likelihood

        results.append({
            "Distribution": name,
            "Parameters": params,
            "Log-Likelihood": log_likelihood,
            "AIC": aic,
            "BIC": bic
        })
    except Exception as e:
        # Record the failure instead of stopping the comparison
        results.append({
            "Distribution": name,
            "Error": str(e)
        })

# Collect the results in a DataFrame, best fit (smallest AIC) first
results_df = pd.DataFrame(results)
if "AIC" in results_df.columns:
    results_df = results_df.sort_values("AIC")
print(results_df.to_string(index=False))
2. Model Establishment
Based on assumptions about the data and everyday observation, we assume that the arrival process follows some general distribution (e.g., a skewed one): most students enter the cafeteria right after classes end, and the arrival rate decreases steadily over time. We assume that service times are exponentially distributed, i.e., memoryless: the remaining service time does not depend on how long the service has already lasted.
So the $G / M / c$ model should be used. (Note: the actual model selection needs to be based on actual data)

In the $G / M / c$ model, inter-arrival times follow a general distribution, service times follow an exponential distribution, and there are $c$ parallel service windows.

Basic Parameters

Symbol	Explanation	Unit
$λ$	Average arrival rate of customers	people/s
$μ$	Average service rate of a single service window	services/s
$c$	Number of service windows	count
$ρ$	System utilization rate	dimensionless
$C_{a}$	Coefficient of variation	dimensionless
$W_{q}$	Mean waiting time	s
$L_{q}$	Average number of people in the queue	count
$P_{0}$	Probability that the system is empty	dimensionless

Where:
$ρ = \frac{λ}{c μ}$
where $ρ < 1$ is necessary for system stability.

$C_{a}$ : coefficient of variation (ratio of standard deviation to mean) of the inter-arrival time distribution. Since the mean inter-arrival time is $1 / λ$: $C_{a} = \frac{σ_{a}}{1 / λ} = σ_{a} λ$

Standard deviation of inter-arrival times $σ_{a}$

Given a set of inter-arrival times $[T_{a 1}, T_{a 2}, \dots, T_{an}]$ (the gaps between consecutive arrivals), we can estimate $σ_{a}$ by:

$σ_{a} = \sqrt{\frac{1}{n - 1} \sum_{i = 1}^{n} (T_{ai} - \bar{T}_{a})^{2}}$

where $\bar{T}_{a} = \frac{1}{n} \sum_{i = 1}^{n} T_{ai}$ is the mean inter-arrival time, so that $λ = 1 / \bar{T}_{a}$.
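
The estimates above can be computed from raw timestamps as sketched below, assuming the $T_{ai}$ are the gaps between consecutive arrivals (so that their mean is $1/λ$). The function name `arrival_statistics` and the example timestamps are hypothetical.

```python
import numpy as np

def arrival_statistics(arrival_times):
    """Estimate lambda, sigma_a and C_a from arrival timestamps given in seconds."""
    t = np.sort(np.asarray(arrival_times, dtype=float))
    gaps = np.diff(t)               # inter-arrival times T_a1 .. T_an
    lam = 1.0 / gaps.mean()         # arrival rate = 1 / mean inter-arrival time
    sigma_a = gaps.std(ddof=1)      # sample standard deviation (n - 1 denominator)
    c_a = sigma_a * lam             # C_a = sigma_a / (1 / lambda)
    return lam, sigma_a, c_a

# Hypothetical timestamps, in seconds after the end of class.
timestamps = [0.0, 2.0, 3.0, 7.0, 8.0, 12.0]
lam, sigma_a, c_a = arrival_statistics(timestamps)
print(f"lambda = {lam:.4f} people/s, sigma_a = {sigma_a:.4f} s, C_a = {c_a:.4f}")
```

A value of $C_{a}$ close to 1 is what exponential inter-arrival times (Poisson arrivals) would produce; values well above 1 indicate burstier arrivals.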

Mean Waiting Time $W_{q}$

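Using an approximation that generalizes the Pollaczek-Khinchin (P-K) formula (the Allen-Cunneen approximation, with the service-time coefficient of variation equal to 1 because service times are exponential), the average wait time in the queue $W_{q}$ can be estimated as:

$W_{q} \approx \frac{C_{a}^{2} + 1}{2} \cdot W_{q, M / M / c}$

where $W_{q, M / M / c}$ is the average waiting time of the $M / M / c$ model, obtained from Little's law:

$W_{q, M / M / c} = \frac{L_{q, M / M / c}}{λ}$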
Average number of people in the queue $L_{q}$

$L_{q, M / M / c}$ (the average number of people in the queue of the $M / M / c$ model) follows from the Erlang-C formula:

$L_{q, M / M / c} = P_{0} \cdot \frac{(λ / μ)^{c}}{c !} \cdot \frac{ρ}{(1 - ρ)^{2}}$

The corrected $L_{q}$ is then approximated by:

$L_{q} \approx \frac{C_{a}^{2} + 1}{2} \cdot L_{q, M / M / c}$
Probability that the system is empty $P_{0}$

$P_{0}$ : the probability that the system is empty, which can be calculated by the following formula: $P_{0} = \left( \sum_{k = 0}^{c - 1} \frac{(λ / μ)^{k}}{k !} + \frac{(λ / μ)^{c}}{c !} \cdot \frac{1}{1 - ρ} \right)^{- 1}$

Average system wait time $W$

The average wait time in the system $W$ includes the wait time in the queue $W_{q}$ and the service time:

$W = W_{q} + \frac{1}{μ}$
Average number of people in the system $L$

The average number of people in the system $L$ includes the number of people in the queue $L_{q}$ and the number of people being served:

$L = L_{q} + \frac{λ}{μ}$
Extra

Dynamic modeling: student arrival rates and service times are currently assumed to be fixed distributions, but may vary in real scenarios (e.g., peak hour fluctuations). It is possible to:

Introduce time-varying parameter models (e.g., time segmentation, smoothing functions).

Use Markov chain models to simulate dynamic service window states.
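
One simple form of time segmentation is to estimate a separate arrival rate for each fixed-length segment of the lunch period. The sketch below does this with `numpy.histogram`; the function name `segmented_arrival_rates`, the 300-second segment length, and the arrival times are illustrative assumptions only.

```python
import numpy as np

def segmented_arrival_rates(arrival_times, start, end, segment_length):
    """Return (segment start times, arrival rate per segment in arrivals per time unit)."""
    # If (end - start) is not a multiple of segment_length, the last segment extends past end.
    edges = np.arange(start, end + segment_length, segment_length)
    counts, edges = np.histogram(arrival_times, bins=edges)
    return edges[:-1], counts / segment_length

# Hypothetical arrival times, in seconds after the lunch period starts.
arrivals = [5, 20, 40, 55, 70, 100, 130, 250, 400, 590]
starts, rates = segmented_arrival_rates(arrivals, start=0, end=600, segment_length=300)

for s, r in zip(starts, rates):
    print(f"segment starting at {s:>4} s: lambda = {r:.4f} people/s")
```

Each segment's rate can then be fed into the queueing formulas separately, which treats the system as approximately stationary within a segment.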

Explore multi-model comparisons:

In addition to $G / M / c$ , try $G / G / c$ or more complex models (e.g., queuing networks) to analyze the overall service system.

Compare model performance and choose the model that best fits the scenario.
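
One way to compare candidate models is to check their analytical waiting times against a small simulation. The sketch below simulates a first-come-first-served queue with $c$ exponential servers and an arbitrary inter-arrival sampler. It is a minimal illustration: the function name `simulate_mean_wait`, the parameter values, and the sample size are assumptions, and the usage example validates the simulator against the known $M / M / 1$ result $W_{q} = ρ / (μ - λ)$ rather than against cafeteria data.

```python
import heapq
import random

def simulate_mean_wait(draw_interarrival, mu, c, n_customers=100_000, seed=1):
    """Simulate a FIFO queue with c exponential servers; return the mean wait in queue."""
    rng = random.Random(seed)
    free_at = [0.0] * c          # min-heap of times at which each server becomes free
    heapq.heapify(free_at)
    clock = 0.0
    total_wait = 0.0
    for _ in range(n_customers):
        clock += draw_interarrival(rng)          # next arrival time
        earliest = heapq.heappop(free_at)        # server that frees up first
        start = max(clock, earliest)             # service starts when both are ready
        total_wait += start - clock
        heapq.heappush(free_at, start + rng.expovariate(mu))
    return total_wait / n_customers

# Validation against M/M/1: exponential inter-arrival times, one server.
lam, mu = 0.5, 1.0
sim_wq = simulate_mean_wait(lambda rng: rng.expovariate(lam), mu, c=1)
exact_wq = (lam / mu) / (mu - lam)
print(f"simulated W_q = {sim_wq:.3f}, exact M/M/1 W_q = {exact_wq:.3f}")
```

Replacing the sampler (for example with one that resamples observed inter-arrival times) gives a simulated $G / M / c$ waiting time that can be set beside the analytical approximation.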

3. Model Using
import math

def calculate_queue_metrics(mu, lambd, c, sigma_a=None):
    """Approximate G/M/c metrics by scaling the M/M/c results with (C_a^2 + 1) / 2."""
    # Step 1: Calculate rho; the queue is stable only if rho < 1
    rho = lambd / (c * mu)
    if rho >= 1:
        raise ValueError("The system is unstable (rho >= 1). Please adjust input parameters.")

    # Step 2: Calculate C_a = sigma_a / (1 / lambda)
    # Defaults to 1 (Poisson arrivals, i.e. plain M/M/c) if sigma_a is not provided
    C_a = sigma_a * lambd if sigma_a is not None else 1.0

    # Step 3: Calculate P0
    a = lambd / mu  # offered load
    tail = a**c / math.factorial(c)
    sum_k = sum(a**k / math.factorial(k) for k in range(c))
    P0 = 1 / (sum_k + tail / (1 - rho))

    # Step 4: Calculate L_q (classical M/M/c)
    L_q_MM_c = P0 * tail * rho / (1 - rho) ** 2

    # Step 5: Adjust L_q for generalized case
    L_q = (C_a**2 + 1) / 2 * L_q_MM_c

    # Step 6: Calculate W_q (Little's law)
    W_q = L_q / lambd

    # Step 7: Calculate W (waiting time plus mean service time)
    W = W_q + 1 / mu

    # Step 8: Calculate L (people in queue plus people in service)
    L = L_q + lambd / mu

    return {"rho": rho, "C_a": C_a, "P0": P0, "L_q": L_q, "W_q": W_q, "W": W, "L": L}

# Example usage (illustrative values, not observed data)
mu = 5  # Service rate
lambd = 4  # Arrival rate
c = 3  # Number of servers
sigma_a = 0.1  # Optional, standard deviation of inter-arrival times

result = calculate_queue_metrics(mu, lambd, c, sigma_a)
print(result)
FootNote

Scaling the $M / M / c$ results by a factor built from $C_{a}$ (the coefficient of variation of the arrival times) gives an approximate result for $G / M / c$.

Outputs

We enter the observed data into the code segment and conclude that

The average arrival time is…
The average number of people in the queue is…

Extra

Enhanced parameter estimation methods: Currently, parameter estimation is mainly based on best fit, but a more rigorous calculation of confidence intervals should be incorporated.

Calculate confidence intervals for parameter estimates using the Bootstrapping method.

Check the stability of the estimates by estimating them separately for different time periods and
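
A percentile bootstrap for one of the estimated parameters can be sketched as follows. The function name `bootstrap_ci`, the number of resamples, and the service times are assumptions made for illustration; here the interval is computed for the service rate $μ$, taken as the reciprocal of the mean service time.

```python
import numpy as np

def bootstrap_ci(data, statistic, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for statistic(data)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    estimates = np.empty(n_resamples)
    for i in range(n_resamples):
        # Resample the observations with replacement and re-estimate.
        resample = rng.choice(data, size=data.size, replace=True)
        estimates[i] = statistic(resample)
    lower, upper = np.percentile(estimates, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lower), float(upper)

# Hypothetical manually timed service times, in seconds.
service_times = [12.0, 15.0, 9.0, 20.0, 14.0, 11.0, 17.0, 13.0, 16.0, 10.0]
mu_hat = 1.0 / np.mean(service_times)
low, high = bootstrap_ci(service_times, lambda x: 1.0 / x.mean())
print(f"mu = {mu_hat:.4f} services/s, 95% bootstrap CI = [{low:.4f}, {high:.4f}]")
```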


The Rationale for the Chosen Method

Limitations of the Chosen Method

Observation

The data for this study came from field observations in our school’s cafeteria, collected from December 2 to December 5, 2024, and from December 9 to December 12, 2024 (8 days in total). Because the school timetable runs on a two-day cycle, each schedule was observed on four days. The data mainly cover the third and fourth periods, which are the lunch periods and the peak hours. Specifically, they include the number of students entering and leaving the cafeteria during each period, the service time of each window, and the number of windows.

Methodology

Data collection relied on the cafeteria’s monitoring system, which recorded the timestamp of each student’s entry into the cafeteria during the day. Data were collected from 11:50 to 13:35 each day. Within this window, the number of students entering the cafeteria was counted manually, and the service time of each service window was recorded by direct observation. The mean service time of a window is the average of several manual timings.

Data preprocessing

During data cleaning, all collected data met the preset criteria, so no data were removed. During preprocessing, timestamps were converted to times relative to the peak hours and counted in 2-minute intervals during peak hours and in 5-minute intervals during off-peak hours.
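
The grouping step can be sketched as below. This is only an illustration under stated assumptions: times are expressed in minutes from the start of the 11:50 to 13:35 window (105 minutes), the peak is taken to run from minute 10 to minute 30, and the function name `bin_counts` and the sample arrivals are hypothetical. The peak boundaries are assumed to line up with the 5-minute grid.

```python
import numpy as np

def bin_counts(minutes, peak_start, peak_end, window_end, peak_width=2, offpeak_width=5):
    """Count arrivals in narrow bins during the peak and wider bins outside it."""
    edges = np.concatenate([
        np.arange(0, peak_start, offpeak_width),                       # before the peak
        np.arange(peak_start, peak_end, peak_width),                   # during the peak
        np.arange(peak_end, window_end + offpeak_width, offpeak_width),  # after the peak
    ])
    counts, edges = np.histogram(minutes, bins=edges)
    return edges, counts

# Hypothetical arrival times in minutes after 11:50.
arrivals_min = [1, 3, 11, 11.5, 12, 13, 29, 31, 104]
edges, counts = bin_counts(arrivals_min, peak_start=10, peak_end=30, window_end=105)

for left, right, n in zip(edges[:-1], edges[1:], counts):
    if n > 0:
        print(f"[{left:>5.1f}, {right:>5.1f}) min: {n} arrivals")
```

Because the bins have different widths, counts should be divided by the bin width before peak and off-peak arrival rates are compared.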
