Load the sample data.
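A minimal sketch of this step, assuming the sample data ships as a MAT-file named mfr that loads a table called mfr into the workspace (the table name is taken from the later steps of this example; the file name is an assumption):

```matlab
% Load the simulated manufacturing data set.
% Assumption: the MAT-file is named mfr and contains a table named mfr.
load mfr

% Optional sanity check: list the variable names in the table.
mfr.Properties.VariableNames
```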
This simulated data is from a manufacturing company that operates 50 factories across the world, with each factory running a batch process to create a finished product. The company wants to decrease the number of defects in each batch, so it developed a new manufacturing process. To test the effectiveness of the new process, the company selected 20 of its factories at random to participate in an experiment: Ten factories implemented the new process, while the other ten continued to run the old process. In each of the 20 factories, the company ran five batches (for a total of 100 batches) and recorded the following data:
Flag to indicate whether the batch used the new process (newprocess)
Processing time for each batch, in hours (time)
Temperature of the batch, in degrees Celsius (temp)
Categorical variable indicating the supplier (A, B, or C) of the chemical used in the batch (supplier)
Number of defects in the batch (defects)
The data also includes time_dev and temp_dev, which represent the absolute deviation of time and temperature, respectively, from the process standard of 3 hours at 20 degrees Celsius.
Fit a generalized linear mixed-effects model using newprocess, time_dev, temp_dev, and supplier as fixed-effects predictors. Include a random-effects term for intercept grouped by factory, to account for quality differences that might exist due to factory-specific variations. The response variable defects has a Poisson distribution, and the appropriate link function for this model is log. Use the Laplace fit method to estimate the coefficients. Specify the dummy variable encoding as 'effects', so the dummy variable coefficients sum to 0.
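The fit described above could be written as follows. This is a sketch, not necessarily the exact call used to produce the results in this example: it assumes the table mfr is already in the workspace, and the output name glme is chosen here for illustration.

```matlab
% Fit a Poisson GLME with a log link and a random intercept per factory.
% Assumption: mfr is the table of sample data; glme is an illustrative name.
glme = fitglme(mfr, ...
    'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)', ...
    'Distribution','Poisson', ...   % response distribution
    'Link','log', ...               % link function
    'FitMethod','Laplace', ...      % estimation method
    'DummyVarCoding','effects');    % sum-to-zero coding for supplier
```

The term (1|factory) in the formula is what adds the random intercept grouped by factory.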
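The number of defects can be modeled using a Poisson distribution:

$$\text{defects}_{ij} \sim \text{Poisson}(\mu_{ij})$$

This corresponds to the generalized linear mixed-effects model

$$\log(\mu_{ij}) = \beta_0 + \beta_1\,\text{newprocess}_{ij} + \beta_2\,\text{time\_dev}_{ij} + \beta_3\,\text{temp\_dev}_{ij} + \beta_4\,\text{supplier\_C}_{ij} + \beta_5\,\text{supplier\_B}_{ij} + b_i,$$

where

$\text{defects}_{ij}$ is the number of defects observed in the batch produced by factory $i$ during batch $j$.
$\mu_{ij}$ is the mean number of defects corresponding to factory $i$ (where $i = 1, 2, \ldots, 20$) during batch $j$ (where $j = 1, 2, \ldots, 5$).
$\text{newprocess}_{ij}$, $\text{time\_dev}_{ij}$, and $\text{temp\_dev}_{ij}$ are the measurements for each variable that correspond to factory $i$ during batch $j$. For example, $\text{newprocess}_{ij}$ indicates whether the batch produced by factory $i$ during batch $j$ used the new process.
$\text{supplier\_C}_{ij}$ and $\text{supplier\_B}_{ij}$ are dummy variables that use effects (sum-to-zero) coding to indicate whether company C or B, respectively, supplied the process chemicals for the batch produced by factory $i$ during batch $j$.
$b_i$ is a random-effects intercept for each factory $i$ that accounts for factory-specific variation in quality.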
Use random to simulate a new response vector from the fitted model.
Display the first 10 rows of the simulated response vector.
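The two steps above might look like the following sketch. It assumes the fitted model is stored in a variable named glme; the output name ynew and the random seed are illustrative choices, so the simulated values need not match the ones shown in this example.

```matlab
% Assumption: glme is the fitted GeneralizedLinearMixedModel object.
rng(0,'twister');        % fix the generator state (seed chosen for illustration)

% Simulate one response per observation in the original data.
ynew = random(glme);

% Show the first 10 simulated defect counts.
ynew(1:10)
```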
ans = 10×1

     3
     3
     1
     7
     5
     8
     7
     9
     5
     9
Simulate a new response vector using new input values. Create a new table by copying the first 10 rows of mfr into tblnew.
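A short sketch of the copy, assuming mfr is the table of sample data in the workspace:

```matlab
% Copy rows 1 through 10 (all variables) of mfr into a new table.
tblnew = mfr(1:10,:);
```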
The first 10 rows of mfr include data collected from trials 1 through 5 for factories 1 and 2. Both factories used the old process for all of their trials during the experiment, so newprocess = 0 for all 10 observations.
Change the value of newprocess to 1 for the observations in tblnew.
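One way to make this change is sketched below. It assumes tblnew is the 10-row table created in the previous step and that newprocess is stored as a numeric flag; if the variable has a different type in your data, convert the replacement values accordingly.

```matlab
% Assumption: tblnew exists and newprocess is a numeric 0/1 flag.
% Set the flag to 1 (new process) for every row of tblnew.
tblnew.newprocess = ones(height(tblnew),1);
```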
Simulate new responses using the new input values in tblnew.
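A sketch of the call, assuming glme is the fitted model and tblnew is the modified table. Because random draws from the random number generator, the values you obtain depend on the generator state and may differ from those displayed here.

```matlab
% Assumption: glme is the fitted model and tblnew holds the new predictor values.
% Simulate responses conditional on the predictor values in tblnew.
ynew2 = random(glme,tblnew)
```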
ynew2 = 10×1

     2
     3
     5
     4
     2
     2
     2
     1
     2
     0