Progress of chronic kidney disease and associated predictors among patients under treatment at Gambi and Felege-Hiwote hospitals

Study area and population and design

The current study was conducted in Bahir Dar, the capital city of the Amhara region. Bahir Dar City is located 585 km from Addis Ababa, the capital city of Ethiopia, and lies on the shore of Lake Tana, the source of the Blue Nile River. A hospital-based retrospective study was conducted among CKD outpatients attending Felege Hiwot Referral Hospital (FHRH) and Gambi Teaching Hospital (GTH), Ethiopia, between September 2017 and January 2021. Both are well-known referral hospitals and have a unit that treats patients in the normal range (GFR > 60 ml/min/1.73 m²) and in the CKD range (GFR 15–60 ml/min/1.73 m²).

Sampling procedures and sample size determination

During the study period, about 1723 CKD patients (950 at Felege Hiwot and the remaining 773 at Gambi) attended the two hospitals. In the sampling procedure, the total sample to be drawn from the two hospitals was first determined using Cochran’s formula. Cochran (1977) developed a formula to calculate a representative sample for proportions as¹⁹:

$$n=\frac{z^{2}pq}{e^{2}}$$

where $n$ is the sample size, z is the critical value for the desired confidence level, p is the estimated proportion of an attribute that is present in the population, q = 1 − p, and e is the desired level of precision.

Suppose the degree of variability in the population is not known. Assuming maximum variability (p = 0.5) and taking a 95% confidence level with ± 5% precision, the required sample size is calculated as follows²⁰: p = 0.5 and hence q = 1 − 0.5 = 0.5; e = 0.05; z = 1.96, so that $n \approx 343$.
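As a minimal sketch of the arithmetic behind Cochran's formula, the snippet below evaluates $z^2 p q / e^2$ for the inputs stated above. The function name `cochran_sample_size` is an illustrative assumption and not part of the original analysis. Evaluated directly, the unadjusted expression gives about 384; the study reports 343, and that reported figure is the one carried forward in the rest of the section.

```python
def cochran_sample_size(z: float, p: float, e: float) -> float:
    """Unadjusted Cochran sample size for a proportion: z^2 * p * (1 - p) / e^2."""
    q = 1.0 - p  # complement of the assumed proportion
    return (z ** 2) * p * q / (e ** 2)


# Inputs stated in the text: 95% confidence, maximum variability, 5% precision.
n0 = cochran_sample_size(z=1.96, p=0.5, e=0.05)
print(round(n0, 2))
```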

After determining the total sample size, proportional allocation was used to draw proportional samples from the two hospitals. The sample size for each hospital was computed using a stratified random sampling technique, as indicated below:

$n_i = \frac{N_i}{N} \times n$, where i = 1, 2, $N_i$ is the population size in the i-th stratum/group with $N_1 + N_2 = N = 1723$, and $n_i$ is the sample size in the i-th stratum/group²¹. Hence, the sample size at Felege Hiwot hospital is $n_1 = \frac{950}{1723} \times 343 \approx 189$. The remaining 154 samples were selected at Gambi teaching hospital, as computed with the same formula. After the sample size for each hospital was determined, patient charts were arranged by chart number and a systematic random sampling technique was applied, with sampling interval $\frac{950}{189} \approx 5$ for Felege Hiwot and $\frac{773}{154} \approx 5$ for Gambi. Overall, 189 charts from Felege Hiwot and 154 charts from Gambi teaching hospital were selected using systematic random sampling with an equal interval of 5.
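The allocation and chart selection described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the function names, the consecutive chart numbering, the random start and the fixed seed are all hypothetical. Truncating the systematic sample to the allocated size is also an assumption, made because an interval of 5 can yield one chart more than required.

```python
import random


def proportional_allocation(strata_sizes: dict, n: int) -> dict:
    """Allocate a total sample n to strata in proportion to their sizes."""
    total = sum(strata_sizes.values())
    return {name: round(size / total * n) for name, size in strata_sizes.items()}


def systematic_sample(charts: list, n: int, rng: random.Random) -> list:
    """Take every k-th chart from a random start, keeping at most n charts."""
    k = len(charts) // n        # sampling interval (5 for both hospitals here)
    start = rng.randrange(k)    # random start within the first interval
    return charts[start::k][:n]


strata = {"Felege Hiwot": 950, "Gambi": 773}   # patient counts from the text
allocation = proportional_allocation(strata, 343)
print(allocation)

rng = random.Random(0)                          # hypothetical seed, for repeatability
fh_charts = list(range(1, 951))                 # hypothetical chart numbers 1..950
gambi_charts = list(range(1, 774))              # hypothetical chart numbers 1..773
fh_sample = systematic_sample(fh_charts, allocation["Felege Hiwot"], rng)
gambi_sample = systematic_sample(gambi_charts, allocation["Gambi"], rng)
print(len(fh_sample), len(gambi_sample))
```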

Inclusion criteria: CKD patients who had at least two visits to either of the two hospitals mentioned above were considered as potential candidates for inclusion in the current study.

Variables under study

Response variable

The response variable in the current study was the status of CKD. CKD is evaluated using two simple tests, namely a blood test known as the estimated glomerular filtration rate (eGFR) and a urine test known as the urine albumin-creatinine ratio (uACR); both tests are needed to obtain a clear picture of kidney health. In the current study, the estimated glomerular filtration rate (GFR) was used with three categories, namely: “normal range (if GFR > 60)”, “CKD range (if GFR 15–60)” and “end-stage (if GFR < 15 ml/min/1.73 m²)”²². Patients at the third stage (end-stage) required kidney transplantation or hemodialysis, and data for such patients were not available during data collection for this study. Hence, among the three categories, only the first two (normal and CKD ranges) were considered in this study. Therefore, the response variable was coded as 1 for “normal range” and 0 for “CKD range”²³. Among the different formulas used to calculate the status of CKD, the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation was used in this study²⁴.
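The coding rule for the response variable can be illustrated with a short sketch. The function name `code_ckd_status` is hypothetical, and assigning the boundary values 15 and 60 to the CKD range is an assumption read from the stated “15–60” interval. The eGFR value itself is taken as given; the CKD-EPI computation is not reproduced here.

```python
from typing import Optional


def code_ckd_status(egfr: float) -> Optional[int]:
    """Code eGFR (ml/min/1.73 m^2) as 1 = normal range, 0 = CKD range.

    End-stage values (< 15) return None because such patients
    were not part of the analysed data.
    """
    if egfr < 15:
        return None   # end-stage: excluded from the study
    if egfr > 60:
        return 1      # normal range
    return 0          # CKD range (15 to 60, boundaries assumed inclusive)


print([code_ckd_status(v) for v in (75.0, 60.0, 42.5, 10.0)])
```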

Predictor variables

The predictor variables in this study were: age (in years), sex (male, female), residential area (urban, rural), use of salt (yes, no), hypertension (HTN) (yes, no), diabetes mellitus type-I (DM-I) (yes, no), diabetes mellitus type-II (DM-II) (yes, no), serum creatinine (SCr) in mg/dl, blood urea nitrogen (BUN) in mg/dl, hematocrit (HCT) in mg/dl, and urinary protein (Up) in mg/dl. The categories of the predictor variables were based on previous studies²⁵.

Statistical analysis

A first-order Markov chain estimation technique models a stochastic process where the probability of transitioning to the next state depends only on the current state. The process is typically represented by a transition matrix, where each cell (P_ij) indicates the probability of transitioning from state i to state j.
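To make the transition matrix concrete, the sketch below estimates $P_{ij}$ from repeated binary measurements by counting observed transitions between consecutive visits and normalising each row. The visit sequences are hypothetical illustration data, not study data, and the function name is an assumption.

```python
from collections import Counter


def estimate_transition_matrix(sequences, states=(0, 1)):
    """Estimate first-order transition probabilities P[i][j] from state sequences."""
    counts = Counter()
    for seq in sequences:
        # Each pair of consecutive visits contributes one observed transition.
        for prev, nxt in zip(seq, seq[1:]):
            counts[(prev, nxt)] += 1
    matrix = {}
    for i in states:
        row_total = sum(counts[(i, j)] for j in states)
        matrix[i] = {
            j: (counts[(i, j)] / row_total if row_total else 0.0) for j in states
        }
    return matrix


# Hypothetical CKD status per visit (1 = normal range, 0 = CKD range).
visits = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
]
P = estimate_transition_matrix(visits)
print(P)
```

Each row of the estimated matrix sums to one whenever the starting state was observed at least once, which is the defining property of a transition matrix.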

Markov chains are widely used in modeling many practical problems. They are also effective in modeling repeated measures on the same individual or patient. In this study, a first-order Markov chain model was used to analyze and predict the status of CKD. Results from previous studies show that the Markov chain model performs very well in predicting repeated measures²⁶.

The chain of successive events is called a Markov process, which is discrete when the events happen at fixed times. A discrete Markov process is applicable when the failure rate λ_i of a component i is constant and the probability that it functions reliably at time t is P_i(t). For a small time increment Δt, the change in the probability that component i is still reliable is:

$P_i(t+\Delta t)=P_i(t)\,e^{-\lambda_i \Delta t} \approx P_i(t)\,(1-\lambda_i \Delta t).$

If all λ_i are the same, the process is called homogeneous; for a system this will usually not be the case, and the process is then called semi-Markov. The equation describing the probability of system states P of a homogeneous process as a function of time t is:

$P_j(t+\Delta t)=\sum\nolimits_{k \ne j} P_k(t)\,\lambda_{kj}\,\Delta t+P_j(t)\left(1-\sum\nolimits_{k \ne j} \lambda_{jk}\,\Delta t\right),$

in which λ_kj is the rate of transition of the system from state k to state j, and λ_jk is the rate of transition from state j back to state k.
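A single time step of this state-probability update can be sketched as below: the new probability of state j is the inflow from every other state k at rate λ_kj plus the probability of staying in j. The two-state rate values and the step size are hypothetical numbers chosen only for illustration.

```python
def step_probabilities(p, rates, dt):
    """Advance state probabilities by one small time increment dt.

    p[j] is the probability of state j at time t; rates[k][j] is the
    transition rate from state k to state j (diagonal entries are ignored).
    """
    n = len(p)
    new_p = []
    for j in range(n):
        # Probability mass arriving in j from every other state k.
        inflow = sum(p[k] * rates[k][j] * dt for k in range(n) if k != j)
        # Total probability of leaving j during dt.
        outflow = sum(rates[j][k] * dt for k in range(n) if k != j)
        new_p.append(inflow + p[j] * (1.0 - outflow))
    return new_p


rates = [[0.0, 0.2],   # hypothetical rate from state 0 to state 1
         [0.1, 0.0]]   # hypothetical rate from state 1 to state 0
p_next = step_probabilities([1.0, 0.0], rates, dt=0.1)
print(p_next)
```

Because every unit of probability leaving one state arrives in another, the updated probabilities still sum to one.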

Random effects

Individual random effects $b_t \sim N(0, \sigma^2)$ were incorporated as an additional term in the linear predictor of the regression model, leading to a regression model with a random intercept²⁷.

Intercept only model: In the current study, the average normal status of CKD in outpatients was assumed to be constant from one visit to another. The model in this regard is written as:

$$\operatorname{logit}(\pi_{it})=\log\left(\frac{\pi_{it}}{1-\pi_{it}}\right)=\log\left(\frac{p(y=1\mid x_{it},\beta)}{1-p(y=1\mid x_{it},\beta)}\right)=\beta_{0}+b_{t0}$$

where logit(π_it) is the log odds of having normal status, β₀ and b_t0 are the fixed and random intercepts, respectively, and $p(Y_{it}=1)$ is the probability of having normal status in outpatients.

Random intercept model: The random intercept model assumes subject-specific changes in the average chance of being classified as normal. With the variation assumed to be constant, the model is written as follows:

$$\operatorname{logit}(\pi_{it})=\log\left(\frac{\pi_{it}}{1-\pi_{it}}\right)=\log\left(\frac{p(y=1\mid x_{it},\beta)}{1-p(y=1\mid x_{it},\beta)}\right)=\sum_{i=0}^{k}\beta_{i}x_{it}+b_{t0}$$

where $\operatorname{logit}(\pi_{it})$ is the log odds of normal status, $x_{it}$ is the design matrix of fixed-effect variables, $\beta$ is the vector of regression coefficients, and $b_{t0}$ is the random intercept.
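To illustrate how the random intercept shifts the predicted probability, the sketch below inverts the logit for a linear predictor plus a random intercept. The coefficient values, the covariate vector and the function names are hypothetical; this is not the fitted model from the study.

```python
import math


def inv_logit(eta: float) -> float:
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def prob_normal(x, beta, b0: float) -> float:
    """P(normal status) for covariates x, fixed effects beta and random intercept b0."""
    eta = sum(b * xi for b, xi in zip(beta, x)) + b0  # linear predictor on the logit scale
    return inv_logit(eta)


beta = [0.5, -0.03]   # hypothetical intercept and age coefficient
x = [1.0, 50.0]       # leading 1 for the intercept, then age in years
print(prob_normal(x, beta, b0=0.0))   # subject with an average intercept
print(prob_normal(x, beta, b0=1.0))   # subject whose intercept is one unit higher
```

Two subjects with identical covariates thus receive different predicted probabilities, which is the subject-specific variation the random intercept is meant to capture.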

Handling missing observations

There are three missing data mechanisms²⁸. The first is missing completely at random (MCAR), in which missingness is independent of both the unobserved and the observed values of the variable of interest. The second is missing at random (MAR), which occurs when missingness depends only on the observed values of the dependent variable and is independent of the unobserved values of the same variable. The third mechanism is missing not at random (MNAR), which is neither MCAR nor MAR²⁸.

Under the MCAR mechanism, the probability of an observation being missing is independent of the responses. Therefore, the probability density of $k_i$ is²⁹:

$f(k_i \mid y_i, w_i, b)=f(k_i \mid w_i, b)$

Under the MAR mechanism, the probability of data being missing is conditionally independent of the unobserved data. That is³⁰,

$f(k_i \mid y_i, w_i, b)=f(k_i \mid y_{i}^{o}, w_i, b)$

Therefore, the distribution of the observed data ($y_{i}^{o}$ ) can be partitioned as:

$f(y_{i}^{o}, k_i \mid \mathbf{X}_i, \mathbf{Z}_i, w_i, \theta, \mathbf{b})=f(y_{i}^{o} \mid \mathbf{X}_i, \mathbf{Z}_i, \theta)\,f(k_i \mid y_{i}^{o}, w_i, \mathbf{b})$

where $\mathbf{X}_{i}$, $\mathbf{Z}_{i}$ and $\mathbf{W}_{i}$ are design matrices for the fixed effects, random effects and missing data process, respectively; $\theta$ and $\mathbf{b}$ are vectors that parameterize the joint distribution. The value of $\mathbf{b}$ describes the measurement and missingness process.

Since individuals with missing values ($y_{i}^{m}$) are not included in the sample used for data analysis, the complete-case analysis provides unbiased estimates, and in this case likelihood-based methods yield valid estimates³¹.

In the MNAR case, neither MCAR nor MAR holds true and the probability of a measurement being missing depends on unobserved outcomes³². The joint distribution of measurements and the missingness process is written as³³:

$f(y_{i}^{o}, k_i \mid \mathbf{X}_i, \mathbf{Z}_i, \mathbf{w}_i, \mathbf{b})=\int f(\mathbf{y}_i \mid \mathbf{X}_i, \mathbf{Z}_i, \mathbf{w}_i)\,f(k_i \mid \mathbf{y}_i, \mathbf{b})\,d\mathbf{y}_{i}^{m}$, and this joint distribution cannot be written in a simplified form.

Check for missing completely at random (MCAR) in longitudinal data analysis

If the missing data are MCAR, the means of the two data sets obtained by categorizing the data by the variable $k_{ij}$ will not differ³⁴.

The logistic regression model can also be used to check the MCAR assumption³⁵. Let,

$p_{ij}=P(k_{ij}=0 \mid y_{i(j-1)})$, then the logistic regression model that can be fitted to check the MCAR assumption is:

$\begin{gathered} \ln \left( \frac{p_{ij}}{1-p_{ij}} \right)=\beta_0+\beta_1 y_{i(j-1)}+\mathbf{X}_{i}^{T}\beta_2+\left( y_{i(j-1)} \times \mathbf{X}_{i}^{T} \right)\beta_3+\varepsilon_{ij} \hfill \\ \text{with}\;k_{ij}=\left\{ \begin{array}{ll} 1 & \text{if}\;y_{ij}\;\text{is observed} \\ 0 & \text{otherwise} \end{array} \right. \hfill \\ \end{gathered}$

where $\beta_{2}$ and $\beta_{3}$ are vectors of regression coefficients associated with the covariates $X_{i}^{T}$ and with the interaction of $X_{i}^{T}$ with $y_{i(j-1)}$, respectively. Under MCAR, $\beta_{1}=0$ and $\beta_{3}=0$. This indicates that dropouts are random and independent of the response.
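Before fitting such a model, the dropout indicator has to be paired with the previous response. The sketch below builds those pairs and tabulates the proportion of missing visits by previous outcome. This is a simple descriptive check swapped in for illustration, not the logistic regression itself. The records and function names are hypothetical, with `None` marking a missed visit.

```python
def dropout_pairs(responses: dict) -> list:
    """Build (previous response, k) pairs, where k = 1 if the current visit is observed."""
    pairs = []
    for seq in responses.values():
        for prev, cur in zip(seq, seq[1:]):
            if prev is None:
                continue  # an unobserved previous response cannot serve as a predictor
            pairs.append((prev, 1 if cur is not None else 0))
    return pairs


def missing_rate_by_previous(pairs: list) -> dict:
    """Proportion of missing current visits for each previous response value."""
    rates = {}
    for prev_value in (0, 1):
        ks = [k for prev, k in pairs if prev == prev_value]
        rates[prev_value] = ks.count(0) / len(ks) if ks else None
    return rates


# Hypothetical records: 1 = normal range, 0 = CKD range, None = missed visit.
records = {
    "A": [1, 1, None],
    "B": [0, None, None],
    "C": [1, 0, 0],
    "D": [0, 0, None],
}
pairs = dropout_pairs(records)
rates = missing_rate_by_previous(pairs)
print(rates)
```

Similar missing rates across previous outcomes would be in line with β₁ = 0; a formal conclusion still requires the fitted model and its test.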

To assess the trend of missingness in the longitudinal trajectory of our data, two-way interactions of the previous result with the other covariates were included in the model one at a time, and their significance was also tested³⁶.

Hence, a logistic regression was conducted to assess whether missing values were affected by previous results; it indicated that dropouts were independent of the previous outcomes (χ²₁ = 0.2018, p = 0.864). Dropout was therefore not related to patients’ previous visits, and the dropout pattern was missing completely at random (MCAR)³⁷. Missing observations were managed using multiple imputation, with 20 imputations conducted on the variables involved in the model³⁸.

Model selection

Essentially, alternative models for the same dataset exist by varying the dependency between two successive visits of each individual in the dataset. The models were: a null model (Null), a full model with independence of visits (ind), an independence model with random intercept (indR), a first-order Markov chain (MC1), and a first-order Markov chain with random intercept (MC1R). Receiver operating characteristic (ROC) curves were employed to identify the best-fitting model among those Markov dependencies; the model with the highest AUC (area under the ROC curve) was considered the best fit to the data. The data were coded and cleaned using SPSS version 26 and analyzed in R version 4.1.1 with the “bild” package²⁷. Finally, statistical significance was tested at the 5% level.
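The AUC-based comparison can be sketched as follows. AUC is computed here as the proportion of (normal, CKD) pairs in which the normal case receives the higher predicted probability, with ties counted as one half. The outcomes and predicted probabilities are hypothetical and do not come from the fitted “bild” models; the function names are assumptions.

```python
def auc(y_true, scores) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5   # ties count as half a concordant pair
    return wins / (len(pos) * len(neg))


def best_model(y_true, predictions: dict):
    """Return the name of the model with the highest AUC and all AUC values."""
    aucs = {name: auc(y_true, s) for name, s in predictions.items()}
    return max(aucs, key=aucs.get), aucs


y = [1, 1, 0, 0, 1, 0]   # hypothetical observed status (1 = normal range)
predictions = {          # hypothetical predicted probabilities per model
    "ind": [0.6, 0.4, 0.5, 0.3, 0.7, 0.45],
    "MC1R": [0.8, 0.7, 0.2, 0.3, 0.9, 0.4],
}
name, aucs = best_model(y, predictions)
print(name, aucs)
```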

Model adequacy

Model adequacy in binary classification refers to evaluating how well a model fits the data and makes accurate predictions. It is crucial to assess model adequacy to ensure that inferences drawn from the model are reliable and to avoid misleading conclusions³⁹. In the current study, goodness-of-fit and model adequacy were assessed over the constellation of fitted values determined by the covariate patterns in the model, not the total collection of covariates⁴⁰. Overall fit was assessed using a combination of the likelihood ratio test and ROC curve analysis of the fitted model⁴¹. A previous study also illustrated the predictive ability, under different aspects, of a binary logistic regression model fitted under standard assumptions⁴².

Ethics procedures for the current study

For the current study, ethical clearance was obtained from the Office of the Vice President for Research and Community Engagement, Bahir Dar University, Ethiopia, with reference number RCS/1412/2017. All methods were performed in accordance with the relevant guidelines and regulations. The secondary data were obtained with the ethical clearance granted by the university’s Vice President for Research and Community Engagement.
