Research method

Abstract

This study employs a mathematical modeling approach.

General

The mathematical modeling draws on the queueing-theory work of adan_queueing2002. We focus on three quantities: the distribution of student arrivals between the third and fourth class sessions, the distribution of service times at the service windows, and the observed number of open windows. Data are collected over 10 days (2 weeks).

Modeling

1. Data Analysis

  1. Arrival data analysis

    Plot the collected data (each record is the time at which a person entered the cafeteria) as a histogram.
    Fit common distributions (Normal, Poisson, Exponential, Lognormal, Gamma, etc.) to determine the distributional form of the arrival and service processes.
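
As a rough sketch of the histogram step, arrival timestamps can be grouped into fixed-width bins before plotting. The sample values and the one-minute bin width are illustrative assumptions, not collected data, and `bin_arrivals` is a hypothetical helper name.

```python
from collections import Counter

def bin_arrivals(arrival_times, bin_width=1.0):
    """Count arrivals per fixed-width time bin (bin index -> count)."""
    counts = Counter(int(t // bin_width) for t in arrival_times)
    first, last = min(counts), max(counts)
    # Include empty bins so gaps in the arrivals stay visible
    return {b: counts.get(b, 0) for b in range(first, last + 1)}

# Made-up arrival times, in minutes after the end of class
sample = [0.2, 0.5, 0.9, 1.1, 1.4, 2.3, 2.8, 4.6]
bin_width = 1.0
histogram = bin_arrivals(sample, bin_width)

# Text rendering of the histogram, one row per bin
for b, count in histogram.items():
    lo = b * bin_width
    print(f"[{lo:.1f}, {lo + bin_width:.1f}) min: {'#' * count} ({count})")
```

The same bin counts could be passed to any plotting tool; the shape of the bars is what suggests which candidate distributions are worth fitting.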

    The collected data were entered into a computer and analyzed for summary statistics and visual representations across different variables using a statistical system. The frequency distribution of the survey results is shown in Figure XX.
    We reviewed the observed data and removed certain outliers, or handled them as described below.

Enhanced outlier handling methods

Directly deleting outliers may lose information, so the following is recommended instead:

  • Use statistical methods (e.g., the interquartile range (IQR) rule used in box plots) to identify outliers and analyze their causes.
  • If the outliers have special significance (e.g., congestion due to a sudden event), they can be modeled or recorded separately.
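
The IQR rule mentioned above can be sketched as follows. This is a minimal illustration under the usual box-plot convention (fences at 1.5 × IQR beyond the quartiles); the service times are made-up values and `iqr_outliers` is a hypothetical helper. Flagged points are returned separately rather than deleted, so their causes can still be examined.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Split values into (kept, flagged) using the box-plot IQR rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in values if lower <= v <= upper]
    flagged = [v for v in values if v < lower or v > upper]
    return kept, flagged

# Made-up service times in seconds; one value is far from the rest
service_times = [28, 31, 30, 33, 29, 35, 32, 30, 120]
kept, flagged = iqr_outliers(service_times)
print("kept:", kept)
print("flagged for review:", flagged)
```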

General

Main
1. Input data:
- Enter the data (each value is the time at which a student arrives at the cafeteria)
2. Distribution fitting:
- Use the distributions (normal, Poisson, exponential, lognormal, gamma) from the scipy.stats module.
- Fit each distribution and estimate its parameters.
3. Model selection criteria:
- AIC (Akaike Information Criterion): balances goodness of fit against the number of fitted parameters.
- BIC (Bayesian Information Criterion): like AIC, but the penalty on the number of parameters grows with the number of data points.
4. Output results:
- Name of the fitted distribution, parameters, log-likelihood values, AIC and BIC.
- The optimal distribution is the one with the smallest AIC/BIC value.
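
The selection rule in step 4 can be illustrated with a small sketch. The log-likelihood values below are hypothetical numbers chosen for illustration, not results from collected data, and `information_criteria` is an assumed helper name.

```python
import math

def information_criteria(log_likelihood, k, n):
    """Return (AIC, BIC) for a model with k fitted parameters and n observations."""
    aic = 2 * k - 2 * log_likelihood
    bic = k * math.log(n) - 2 * log_likelihood
    return aic, bic

# Hypothetical fits on n = 200 observations: name -> (log-likelihood, number of parameters)
n = 200
candidates = {"Exponential": (-310.0, 2), "Gamma": (-308.5, 3)}

scores = {name: information_criteria(ll, k, n) for name, (ll, k) in candidates.items()}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
print("best by AIC:", best_by_aic)
print("best by BIC:", best_by_bic)
```

With these made-up numbers the two criteria disagree, because BIC charges more for the extra parameter. When that happens, both values should be reported rather than only the preferred one.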

Extra
  • Addition of distributional tests:
    • Kolmogorov-Smirnov (K-S) test: checks whether the data are consistent with the fitted distribution.
    • Anderson-Darling test: more sensitive to the tails of the distribution.
  • Multi-distribution combination test:
    • If no single distribution fits well, try a mixture model (e.g., a Gaussian mixture) to capture complex patterns.
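      The probability density function (PDF) of the Gaussian Mixture Model is:

      $$p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \sigma_k^2)$$

Where:

  • $K$ is the number of Gaussian distributions (i.e., the number of mixture components).
  • $\pi_k$ is the weight of the $k$-th Gaussian distribution, satisfying $\sum_{k=1}^{K} \pi_k = 1$.
  • $\mathcal{N}(x \mid \mu_k, \sigma_k^2)$ represents the $k$-th Gaussian distribution, with mean $\mu_k$ and variance $\sigma_k^2$.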
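
To make the mixture density concrete, the sketch below evaluates a Gaussian mixture PDF as a weighted sum of component densities. It only evaluates a given mixture and does not fit one. The two-component parameters are made-up, and `normal_pdf` and `gmm_pdf` are hypothetical helper names.

```python
import math

def normal_pdf(x, mean, variance):
    """Density of a single Gaussian component."""
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)

def gmm_pdf(x, weights, means, variances):
    """Mixture density: weighted sum of the Gaussian component densities."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))

# Made-up mixture: an early rush around minute 2 and a later, wider wave around minute 10
weights, means, variances = [0.7, 0.3], [2.0, 10.0], [1.0, 4.0]
density_at_peak = gmm_pdf(2.0, weights, means, variances)
print(f"density at x = 2.0: {density_at_peak:.4f}")
```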
import numpy as np
import pandas as pd
import scipy.stats as stats

arrival_times = np.array(
    input("Please enter the values (separated by spaces): ").split(), dtype=float
)

# This part can be modified after the data is collected.
# Poisson is a discrete distribution: it is handled separately below and is
# only meaningful when the values are non-negative integer counts.
distributions = {
    "Normal": stats.norm,
    "Poisson": stats.poisson,
    "Exponential": stats.expon,
    "Lognormal": stats.lognorm,
    "Gamma": stats.gamma,
}

results = []
n = len(arrival_times)  # number of observations

for name, distribution in distributions.items():
    try:
        if name == "Poisson":
            # Discrete scipy distributions have no fit(); the maximum-likelihood
            # estimate of the Poisson rate is the sample mean.
            params = (arrival_times.mean(),)
            log_likelihood = np.sum(distribution.logpmf(arrival_times, *params))
        else:
            params = distribution.fit(arrival_times)
            log_likelihood = np.sum(distribution.logpdf(arrival_times, *params))

        if not np.isfinite(log_likelihood):
            raise ValueError("data lie outside the support of this distribution")

        k = len(params)  # number of fitted parameters
        aic = 2 * k - 2 * log_likelihood
        bic = k * np.log(n) - 2 * log_likelihood

        results.append({
            "Distribution": name,
            "Parameters": params,
            "Log-Likelihood": log_likelihood,
            "AIC": aic,
            "BIC": bic,
        })
    except Exception as e:
        results.append({"Distribution": name, "Error": str(e)})

# Collect the results in a DataFrame and print them
results_df = pd.DataFrame(results)
print(results_df.to_string(index=False))
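
The K-S check listed under Extra could be sketched as below, using `scipy.stats.kstest` on a fitted exponential distribution. The data are synthetic (drawn from a seeded random generator) and stand in for collected inter-arrival gaps. Fixing the location at 0 is an assumption made for this illustration.

```python
import numpy as np
from scipy import stats

# Synthetic inter-arrival gaps (not collected data), reproducible via the seed
rng = np.random.default_rng(0)
gaps = rng.exponential(scale=2.0, size=200)

# Fit an exponential distribution with the location fixed at 0
loc, scale = stats.expon.fit(gaps, floc=0)

# Compare the empirical distribution with the fitted exponential
ks = stats.kstest(gaps, "expon", args=(loc, scale))
print(f"fitted scale = {scale:.3f}")
print(f"K-S statistic = {ks.statistic:.3f}, p-value = {ks.pvalue:.3f}")
```

Because the parameters are estimated from the same data that are tested, the reported p-value tends to be optimistic, so it is better read as a rough screen than as a strict test. `scipy.stats.anderson` provides the Anderson-Darling test for a limited set of distributions.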

2. Model Establishment

Based on assumptions about the data and on routine observation, we assume that the arrival process follows some general distribution (e.g., a skewed distribution): most students enter the cafeteria right after class ends, and the arrival rate then decreases steadily over time. We assume that the service process is a Poisson process, i.e., service times are exponentially distributed and each service time is random and independent of the previous ones.
So the $G/M/c$ model should be used. (Note: the final model selection must be based on the actual data.)
  • In the $G/M/c$ model, the arrival process follows a general distribution, the service times are exponential (a Poisson service process), and there are $c$ parallel service windows.

Basic Parameters

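Symbol      Explanation                                        Unit
$\lambda$   Average arrival rate of customers                  customers per unit time
$\mu$       Average service rate of a single service window    customers per unit time
$c$         Number of service windows                          count
$\rho$      System utilization rate                            dimensionless
$C_a$       Coefficient of variation                           dimensionless
$W_q$       Mean waiting time                                  time
$L_q$       Average number of people in the queue              people
$P_0$       Probability that the system is empty               dimensionless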
Where:

$$\rho = \frac{\lambda}{c\mu}$$

and $\rho < 1$ is necessary for system stability.

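  • $C_a$: coefficient of variation (ratio of standard deviation to mean) of the inter-arrival time distribution:

$$C_a = \frac{\sigma_a}{\bar{t}} = \lambda \sigma_a$$

where $\sigma_a$ is the standard deviation of the inter-arrival times and $\bar{t} = 1/\lambda$ is their mean.

  • Given a set of observed inter-arrival times $t_1, \dots, t_n$, we can estimate $\sigma_a$ by:

$$\sigma_a = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (t_i - \bar{t})^2}$$

  • $\bar{t} = \frac{1}{n} \sum_{i=1}^{n} t_i$ is the average of the inter-arrival times.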
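
A minimal sketch of this estimate, assuming the raw data are arrival timestamps from which inter-arrival gaps are derived. The timestamps are made-up values, `arrival_cv` is a hypothetical helper, and the sample standard deviation (dividing by n - 1) is an assumed choice.

```python
import statistics

def arrival_cv(arrival_timestamps):
    """Estimate (lambda, sigma_a, C_a) from arrival timestamps."""
    ts = sorted(arrival_timestamps)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]  # inter-arrival times
    mean_gap = statistics.mean(gaps)
    sigma_a = statistics.stdev(gaps)   # sample standard deviation (n - 1 in the denominator)
    lam = 1.0 / mean_gap               # arrival rate = 1 / mean inter-arrival time
    return lam, sigma_a, sigma_a / mean_gap

# Made-up arrival timestamps in minutes
timestamps = [0.0, 0.5, 1.5, 2.0, 4.0, 4.5]
lam, sigma_a, c_a = arrival_cv(timestamps)
print(f"lambda = {lam:.3f} per minute, sigma_a = {sigma_a:.3f} min, C_a = {c_a:.3f}")
```

A value of $C_a$ near 1 is what exponential inter-arrival times would give. Values well below 1 indicate more regular arrivals, and values above 1 indicate burstier arrivals.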

Mean Waiting Time

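  • Using a generalized form of the Pollaczek-Khinchin (P-K) formula, the average waiting time in the queue $W_q$ can be approximated as:

$$W_q \approx \frac{C_a^2 + 1}{2} \, W_q^{M/M/c}$$

  • where $W_q^{M/M/c}$ is the average waiting time of the $M/M/c$ model:

$$W_q^{M/M/c} = \frac{L_q^{M/M/c}}{\lambda}$$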

Average number of people in the queue

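  • $L_q^{M/M/c}$ (the average number of people in the queue of the $M/M/c$ model) is based on the Erlang-C formula:

$$L_q^{M/M/c} = \frac{P_0 \, (\lambda/\mu)^c \, \rho}{c! \, (1-\rho)^2}$$

  • The revised $L_q$ is defined by:

$$L_q = \frac{C_a^2 + 1}{2} \, L_q^{M/M/c}$$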

Probability that the system is empty

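  • $P_0$: the probability that the system is empty, which can be calculated by the following formula:

$$P_0 = \left[ \sum_{k=0}^{c-1} \frac{(\lambda/\mu)^k}{k!} + \frac{(\lambda/\mu)^c}{c! \, (1-\rho)} \right]^{-1}$$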

Average system wait time

  • The average wait time in the system includes the wait time in the queue and the service time:

$$W = W_q + \frac{1}{\mu}$$

Average number of people in the system

  • The average number of people in the system includes the number of people in the queue and the number of people being served:

$$L = L_q + \frac{\lambda}{\mu}$$
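
A worked numeric example may help check the chain of quantities in this section. The inputs $\lambda = 4$, $\mu = 5$, $c = 3$, $\sigma_a = 0.1$ are illustrative values only, not measurements, and the results are rounded to four significant figures.

```latex
\begin{aligned}
\rho &= \frac{\lambda}{c\mu} = \frac{4}{3 \cdot 5} \approx 0.2667,
\qquad C_a = \lambda \sigma_a = 4 \cdot 0.1 = 0.4 \\
P_0 &= \left[ 1 + 0.8 + \frac{0.8^2}{2!} + \frac{0.8^3}{3!\,(1 - 0.2667)} \right]^{-1}
     \approx \frac{1}{2.2364} \approx 0.4472 \\
L_q^{M/M/c} &= \frac{P_0 \, (\lambda/\mu)^c \, \rho}{c!\,(1-\rho)^2}
     \approx \frac{0.4472 \cdot 0.512 \cdot 0.2667}{6 \cdot 0.5378} \approx 0.01892 \\
L_q &= \frac{C_a^2 + 1}{2} \, L_q^{M/M/c} = 0.58 \cdot 0.01892 \approx 0.01097 \\
W_q &= \frac{L_q}{\lambda} \approx 0.002744,
\qquad W = W_q + \frac{1}{\mu} \approx 0.2027,
\qquad L = L_q + \frac{\lambda}{\mu} \approx 0.8110
\end{aligned}
```

With utilization this low, almost all of the time in the system is service time ($1/\mu = 0.2$), and the queue itself contributes very little.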
Extra
  • Dynamic modeling: student arrival rates and service times are currently assumed to follow fixed distributions, but they may vary in real scenarios (e.g., peak-hour fluctuations). It is possible to:
    • Introduce time-varying parameter models (e.g., time segmentation, smoothing functions).
    • Use Markov chain models to simulate dynamic service window states.
  • Explore multi-model comparisons:
    • In addition to $G/M/c$, try other queueing models or more complex models (e.g., queuing networks) to analyze the overall service system.
    • Compare model performance and choose the model that best fits the scenario.
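
The time-segmentation idea can be sketched as a piecewise-constant arrival rate, estimated separately for each segment of the observation window. The arrival times, the 5-minute segment length, and the values of `mu` and `c` below are made-up assumptions, and `segment_arrival_rates` is a hypothetical helper.

```python
import math

def segment_arrival_rates(arrival_times, segment_length, horizon):
    """Arrivals per unit time in each consecutive segment of [0, horizon)."""
    n_segments = math.ceil(horizon / segment_length)
    counts = [0] * n_segments
    for t in arrival_times:
        if 0 <= t < horizon:
            counts[int(t // segment_length)] += 1
    rates = []
    for i, count in enumerate(counts):
        # The last segment may be shorter than segment_length
        width = min(segment_length, horizon - i * segment_length)
        rates.append(count / width)
    return rates

# Made-up arrival times in minutes after the end of class
arrivals = [0.2, 0.5, 0.9, 1.1, 1.4, 2.3, 2.8, 4.6, 7.5, 9.0]
rates = segment_arrival_rates(arrivals, segment_length=5, horizon=10)

# Utilization per segment for assumed mu (per window, per minute) and c windows
mu, c = 0.5, 4
utilization = [lam / (c * mu) for lam in rates]
print("arrival rate per segment:", rates)
print("utilization per segment:", utilization)
```

Each segment can then be analyzed with its own arrival rate. A segment whose utilization reaches 1 or more signals that the steady-state formulas no longer apply there.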

3. Model Application

import math

def calculate_queue_metrics(mu, lambd, c, sigma_a=None):
    """Approximate G/M/c metrics by scaling the M/M/c queue length by (C_a^2 + 1) / 2."""
    # Step 1: Calculate rho (the queue is stable only if rho < 1)
    rho = lambd / (c * mu)
    if rho >= 1:
        raise ValueError("The system is unstable (rho >= 1). Please adjust input parameters.")

    # Step 2: Calculate C_a = sigma_a / (1 / lambd)
    # Default to 1 (exponential inter-arrival times) if sigma_a is not provided
    C_a = sigma_a * lambd if sigma_a is not None else 1.0

    # Step 3: Calculate P0
    a = lambd / mu  # offered load
    sum_k = sum(a**k / math.factorial(k) for k in range(c))
    P0 = 1 / (sum_k + a**c / (math.factorial(c) * (1 - rho)))

    # Step 4: Calculate L_q (classical M/M/c, Erlang-C)
    L_q_MM_c = P0 * a**c * rho / (math.factorial(c) * (1 - rho)**2)

    # Step 5: Adjust L_q for generalized case
    L_q = (C_a**2 + 1) / 2 * L_q_MM_c

    # Step 6: Calculate W_q (Little's law)
    W_q = L_q / lambd

    # Step 7: Calculate W
    W = W_q + 1 / mu

    # Step 8: Calculate L
    L = L_q + lambd / mu

    return {"W": W, "L": L, "P0": P0, "W_q": W_q, "L_q": L_q}

# Example usage
mu = 5  # Service rate
lambd = 4  # Arrival rate
c = 3  # Number of servers
sigma_a = 0.1  # Optional, standard deviation of inter-arrival times

result = calculate_queue_metrics(mu, lambd, c, sigma_a)
print(result)
FootNote
  • Use $C_a$ (the coefficient of variation of the inter-arrival time) to correct the formula for $L_q^{M/M/c}$; this gives $L_q$ for the $G/M/c$ model.

Outputs

  • We enter the observed data into the code segment and conclude that

The average arrival time is…
The average number of people in the queue is…

Extra
  • Enhanced parameter estimation methods: Currently, parameter estimation is mainly based on best fit, but a more rigorous calculation of confidence intervals should be incorporated.
    • Calculate confidence intervals for parameter estimates using the Bootstrapping method.
    • Check the stability of the estimates by estimating them separately for different time periods and