Statistics for Clinicians - Nov. 2020 Webinar
Video Transcription
Well, welcome to today's webinar. I'm Dr. Heidi Harvey, the moderator for today's webinar. Before we begin, I'd like to share that we're going to be taking questions at the end, but you can submit them anytime by typing them into the Q&A section located at the bottom of the event window. So try to use Q&A rather than chat. Today's webinar is entitled Statistics for Clinicians and is being presented by Dr. Jennifer Wu. Dr. Wu is originally from Frederick, Maryland. She earned her BA in biology from Harvard University and her MD from the University of California at San Francisco. She then trained in obstetrics and gynecology at Brigham and Women's Hospital and Mass General Hospital in Boston, then completed fellowship in FPMRS at the University of North Carolina at Chapel Hill. Concurrently, she received her master's in public health and epidemiology at the UNC School of Public Health. From 2007 to 2012, Dr. Wu served on the faculty at Duke University, and in 2013 she rejoined the FPMRS division at UNC Chapel Hill and was appointed division chief in 2018. From July 2019 to April 2020, she served as the interim chair of the department of OB-GYN, and she now serves as a senior vice chair of the department. She remains actively engaged in clinical trials and multidisciplinary research in pelvic floor disorders with collaborators in epidemiology, gastroenterology, geriatrics, and nursing. Thank you. Well, thank you so much for the introduction, Dr. Harvey. I really appreciate it, and thank you to the 54 participants. I can't see your faces because of the webinar format, but it's great to have you here. The title of this webinar is Statistics for Clinicians, and if you are able to have a piece of paper and a pen or pencil, there are going to be questions scattered throughout the talk today, and it might be helpful for you to jot down what you think the answers are going to be. Typically, if I were doing this in person, I'd be able to do something more interactive and have you respond, but since it's this webinar format, we're just going to have to go with you jotting it down, and then I think Dr. Harvey is going to be able to see the questions that come up. I can't see those, I think, right? I can only see the chat, so I'll lean on Dr. Harvey to keep me on task or if there are questions that come up. But there'll be questions scattered throughout just so you can check your content knowledge, all right? Okay, we'll get started. Let's see. I have no disclosures. So, the objectives of this talk are to understand the different types of research data, describe commonly used statistics, and then, by the end, you should be able to determine which statistical tests you should use to analyze your data. So, again, this is going to be a hopefully very practical talk, right? So, let's go over some statistical concepts. Let's first talk about what the different types of data are. There's categorical, also known as nominal, data. Examples are male or female, alive or dead, insurance status, you know, commercial, Medicaid, Medicare. There's ordinal data. That's ordered or ranked data. An example is cancer stages. You know that stage two is worse than stage one and better than stage three, but it's not really clear by how much the intervals between these stages differ.
So, you don't know how much these stages are different, or maybe like functional status: low, moderate, high. And then you can also have continuous, also known as interval, data. So, those are data with equal distances between each of the values. Examples are age, BMI, and cost. So, the first question is: let's say you're doing a study of 200 women who underwent a new procedure for pelvic organ prolapse. In describing your study population, state whether the following characteristics are continuous, ordinal, or categorical. Okay, so these are the six different variables that you have in your study, and you want to identify what type of data each of these variables is. I'm going to give you a moment here. And Dr. Harvey, can you see the participants so that you can get a sense of when people are looking up and done answering questions or not? Not right now, I will try. Okay. No, we can't see attendees. We can't see attendees, okay. So, I will just guesstimate to the best of my knowledge how long it is taking you all to answer these questions. Okay, so let's go through the answers here. And again, if I'm going too fast and not giving people enough time, please let me know in the chat so I can slow it down and give people enough time to really put down your guesses. I think if you force yourself to commit to an answer, it helps you learn the material better. Okay, so age. That was an example of continuous data, right? Equal distances between each of the values. Race. That would be categorical: white, African American, Asian. BMI is similar to age, right? Continuous data. Then you have POPQ stage. That's in an order, right? Stage two is worse than stage one. Stage four is worse than stage three. So, that's ordered data. So, ordinal. And then postmenopausal, I gave you a clue. It's yes or no. So, that's categorical. And then prior hysterectomy is also categorical. You've either had a prior hysterectomy or you've not had a prior hysterectomy. All right. And so, when you're thinking about your data, again, this is really practical. You need to describe your data, right? You need to figure out, well, what's the best way to describe your data? And we commonly use these different measures of central tendency, right? So, a common one we all know is the mean, which is the average. And we use the mean for describing our continuous data, like age or BMI. And then there's the median. That's the value with half the subjects below and half the subjects above this value. And we often will use this for ordinal data or sometimes skewed data, which means it's not normally distributed. And we'll go through a little bit of this in a moment. And then the mode is the most frequently occurring value or category. And that we might use for categorical data, although we commonly use proportions for that. All right. So, how would you best describe your data? I commonly think of this as, you know, when you're putting together your table one, right? So, table one is usually how you describe your study population. And for each of the variables, you want to give some sense of your data, right? It's often, again, one of these measures of central tendency. So, for continuous data, again, examples are age and BMI, you will often report the mean. And then you usually will add the standard deviation.
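To make that concrete, here is a minimal sketch of those three reporting styles, mean with standard deviation, median with IQR, and count with percent, assuming Python with numpy and completely made-up values:

```python
import numpy as np

age = np.array([45, 52, 61, 58, 49, 63, 55, 70, 47, 66])   # continuous
parity = np.array([0, 1, 2, 2, 3, 1, 2, 8, 2, 1])          # skewed
postmenopausal = np.array([0, 1, 1, 1, 0, 1, 1, 1, 0, 1])  # categorical, yes/no

# Continuous, roughly normal: mean plus or minus standard deviation
print(f"Age: {age.mean():.1f} +/- {age.std(ddof=1):.1f}")

# Skewed or ordinal: median with the interquartile range (25th to 75th percentile)
q25, q50, q75 = np.percentile(parity, [25, 50, 75])
print(f"Parity: median {q50:.0f} (IQR {q25:.0f}-{q75:.0f})")

# Categorical: number and percent
n_yes = postmenopausal.sum()
print(f"Postmenopausal: {n_yes} ({100 * n_yes / len(postmenopausal):.0f}%)")
```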
For skewed or ordinal data, so skewed is, again, not normally distributed, you would often report the median, which is the 50th percentile. And then often with that, you'll put the IQR, or the interquartile range. And that's usually the 25th to the 75th percentile. And then for your categorical data, we often report the number and the percent. So, you know, the proportion, if we use race as an example, the proportion that are Caucasian, African American, Asian, other, et cetera. And here is, again, just to remind you about distributions. And the reason this is important is it will help us figure out, again, what kind of tests you want to use to analyze your data. So, we often will refer to the normal distribution, or the bell-shaped curve. And you notice in the normal distribution, the mean, the median, and the mode are all in the middle, the same value. And then you can have skewed data. And there's an example here of negatively skewed or positively skewed. But you can see here, if you report the mean and not the median, you get different values depending on how your data is skewed. So, we often will use the median when we have skewed data. And then just to remind you again about the normal distribution, many statistical tests that we're going to think about assume the normal distribution. And we think about, again, 1.96 standard deviations, or plus or minus almost two standard deviations, encompassing 95% of the values of whatever we're trying to measure. Okay. So, why does it matter what the distribution is? Like, why do we care if it's a normal distribution or not? Well, that really helps us figure out, again, what kind of statistics we're going to use to analyze our data. Okay. So, we're going to break it down into parametric statistics and nonparametric statistics. Parametric statistics are the ones that you use for data that are normally distributed. Okay. It's commonly for continuous data. And again, data that's normally distributed. Or also, if you have a larger sample size. And it doesn't have to be super large. I mean, even just over 30, often you can get away with the parametric statistics. And those are the ones that we're the most familiar with. Okay. Then there are the nonparametric statistics. And we're going to spend a lot of time going through these two kinds of statistics. The nonparametric statistics are going to be for ordinal data. Or if your data is skewed, again, either negatively or positively. Or if you have a small sample size. And these tests have less power. So, if you're not sure, you can use a nonparametric test. And if it's statistically significant with a nonparametric test, it will be significant with your parametric statistic. So, again, just think: if you have a normal distribution, a large sample size, you're using parametric. And that's what we commonly are more familiar with. But again, if you're not sure, you have a smaller data set, skewed data, you can start with the nonparametric statistics. Okay. So, let's actually get to the statistical tests. Again, this is going to be a really practical approach to how can I figure out what test I'm going to use to analyze my data. Okay. So, these are the four things you want to think about in picking a test. And then we're going to go through a table. And I think by the end, you're going to feel a lot more comfortable figuring out what test you should use to analyze your data. Okay.
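If you want something more than eyeballing a histogram, here is a sketch of that parametric-versus-nonparametric decision, assuming Python with scipy and a made-up sample; the Shapiro-Wilk test is one common normality check, not the only one, and your own statistical software offers equivalents:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(28, 5, 40)   # e.g., 40 BMI values (made up)

stat, p = stats.shapiro(sample)  # Shapiro-Wilk test of normality
if p > 0.05 and len(sample) >= 30:
    print("Roughly normal, decent n: parametric tests are reasonable")
else:
    print("Skewed or small sample: lean toward non-parametric tests")
```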
So, the first thing is: what kind of data are you dealing with? Is it continuous? Is it ordinal? Or is it categorical? And then you want to figure out that question again of parametric versus non-parametric. The next thing is to figure out how many groups you're dealing with. How many groups are you comparing? Are there two groups? Are there three groups? And the last question is: is the data paired or not paired? Okay. And then we're going to go through this in a lot of detail with a lot of repetition and a lot of questions. Okay. So, the first thing is this table. This is sort of my big cheat sheet. Okay. So, if you were going to remember one thing, let's say some of you are getting ready to take your boards or you're working on your thesis, this is the key table to take away with you. Right. If you're going to print up one thing from this talk or file something away, it's this table. Okay. So, the first column breaks it down by the kind of groups you have, whether your groups are paired, or your goal. Right. And then the next columns: the second column is for continuous data, the parametric tests, what test would that be? The third column is if you're going to do nonparametric tests, because again, your sample size is small or your data is skewed. And then the last column is what you do if you have categorical data. Okay. So, again, this is important. All the things we've talked about before are: what kind of data are we dealing with, and what is the distribution? And that will help us figure out what test to use. Right. So, we're going to go through this almost row by row. The first thing is when you just have one group. You have a case series of 30 people and you're trying to describe those patients, you know, how are you going to best describe them? And we talked a little bit about this already. Again, this is what you would put into your table one to describe your study population. And so, again, remember the mean is used for, again, normal distribution, parametric data, and you report the mean plus or minus the standard deviation. So, again, this is like age, BMI, I gave the example of cost, anything that's going to be continuous. Sometimes your PFDI-20 scores that go from, let's say, 0 to 300. And then you're going to use the median for your ordinal data or, again, data that's really skewed. Okay. And you would report the median. And then usually with that you report the IQR, or the interquartile range, 25th to 75th percentile. Okay. My best example of data that you'd want to describe with a median is parity, right? Parity doesn't go from 1 to 100. Parity goes from, like, 0 to maybe 12, right? You know, people aren't having 20 children. And so, that data is really skewed, right? You have a lot of people having 1 or 2 or 3 children, but very few having 8 to 12 kids, right? So, it's really skewed data. The best thing to use is the median. You can't really have, like, a mean of 2.6 babies, right, or a parity of 2.6. Usually you'd have a median of 2, for example. Okay. So, let's get to some of the other statistics that we're more familiar with when we're analyzing our data. A common thing we do is compare two different groups. So, if the data is continuous, you would use a Student's t-test. You have two groups, you're comparing their age, you're going to use a Student's t-test. Okay. Non-parametric tests.
So, again, if something is skewed data, like my example of parity, and you have two groups, you're going to do the Mann-Whitney U. It's also known as the Wilcoxon rank sum. I tend to use Mann-Whitney U as the one I refer to, but I also want you to know the two terminologies. And then, if you have categorical data, race, prior hysterectomy, post-menopausal, remember, those were the examples from the beginning where we were trying to describe what kind of data we have, you're going to use a chi-square test. Okay. Two groups. Did they have a prior hysterectomy or not? Are they menopausal? Yes or no. You'll use a chi-square. Fisher's exact. I do want to mention that for a moment. That's the test you're going to use basically when you have a rare outcome. Okay. The technical time you would use it is when the expected value in a two-by-two cell is less than five. That probably will not come up, but what you want to think about is whether you have something really rare. For example, HIV or AIDS in our patient population for us here in North Carolina will be on the rare side. You probably want to use a Fisher's exact. And oftentimes, if you have a statistical program that you're using, it'll tell you, oh, there's a low value expected in that one cell, you should go ahead and use a Fisher's exact instead of a chi-square. Okay. So let's go through some examples here. So the question is, in a randomized trial of treatment A and treatment B, each with 100 subjects per group, what test would you use to compare the following between these two groups? So you're comparing treatment A to treatment B, and these are the variables that are going to go on, let's say, your table one. You want to see, did your randomization work well? All right. So the first one is age. Second is whether you've had a prior hysterectomy. Then BMI and parity. So I'm going to give you a moment; if you would, try to write those down so we can commit to an answer. Okay. I'm hoping I'm giving you enough time. Again, if you need more time, somehow put that in the chat or the Q&A so we know to give you more time. Okay. And I've kind of talked about all these examples, so hopefully that wasn't too hard to answer. But again, think about that table, right? We have two groups. We know age is continuous, so that's a Student's t-test. Status post-hysterectomy, that's a yes or no question. I either had a hysterectomy or didn't, so it's categorical. We're going to use a chi-square for, again, two groups. BMI, similar to age, that's a Student's t-test. And then parity, I kind of gave you a clue, right? That's my classic example of skewed data. And we're going to use the Mann-Whitney U, again, also known as the Wilcoxon rank sum, for that analysis. All right. So going back to this table, we're going to go row by row, okay, two groups. You want to figure out what kind of data you are dealing with. Is it continuous? Is it categorical? Is it skewed or ordinal, so you want to use a non-parametric test? And it kind of flows from this table. Okay. So again, these are probably the most common tests we see, because oftentimes our studies are comparing two different groups. And so these are the tests that you're going to use most often. Okay. So let's say we're going to now do three groups. You're doing a randomized trial of treatment A, treatment B, and treatment C, and they all have a good number of patients in their group.
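Before working through the three-group question, here is a sketch of the two-group tests just covered, assuming Python with scipy and invented data; SPSS, Stata, and SAS have the same tests under similar names:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
age_a, age_b = rng.normal(60, 10, 100), rng.normal(62, 10, 100)   # continuous
parity_a, parity_b = rng.poisson(2, 100), rng.poisson(2, 100)     # skewed

# Continuous data, two groups: Student's t-test
print(stats.ttest_ind(age_a, age_b))

# Skewed data like parity, two groups: Mann-Whitney U (Wilcoxon rank sum)
print(stats.mannwhitneyu(parity_a, parity_b))

# Categorical data, two groups: chi-square on a 2x2 table
# (rows = treatment A/B, columns = prior hysterectomy yes/no; invented counts)
chi2, p, dof, expected = stats.chi2_contingency([[30, 70], [25, 75]])
print(chi2, p)

# Rare outcome (an expected cell count under five): Fisher's exact instead
print(stats.fisher_exact([[2, 98], [8, 92]]))
```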
So they've got 100-ish subjects per group. What test are you going to use to compare the following between groups? And the options are a one-way ANOVA, a Kruskal-Wallis, and a chi-square. This does kind of presume that you already memorized my table, right? But what you can think through is: you've got three groups and you have these characteristics, and we've already talked about what kind of data they are, right? So then you need to try to remember, what do I do for continuous data? What do I do for ordinal data or skewed data? And what do I do for categorical data? Okay. So as a reminder, this is what the three-group analysis looks like. Okay. So if you've got three groups, you want to say, you know, age is continuous. You know, we've talked about that several times. And then you just have to say, okay, what is the test I'm going to use for my continuous data? Oh, it's a one-way ANOVA. We talked about prior hysterectomy being categorical. You either did or did not have a hysterectomy in the past. That still remains chi-square. So that's kind of easy. It's chi-square whether it's two groups or three groups. And then parity, again, my classic example of skewed data where you're going to use a non-parametric test, that's going to be the Kruskal-Wallis. Okay. So those are, again, the most common things that we're going to do: two groups in the analysis, or again, three groups. You kind of want to just look at that table; you know what kind of data you have, and then figure it out based on the table. (There is a sketch of the three-group tests just after this paragraph.) Okay. So now we're going to move to regression. So it's kind of a different way of thinking. What we're doing is we're trying to predict, let's say, one variable from another. We're going to assess the impact of independent variables on an outcome of interest. Okay. And the way we figure out what kind of regression we're going to do, and there's a whole bunch of different kinds of regression analyses, but we're going to just simplify it to linear or logistic. Logistic is probably the most common one that we see in many of our studies. Okay. The way you figure out if you're going to do linear or logistic regression is based on the outcome, what you're trying to predict. Okay. What we would call the dependent variable. And so that's kind of highlighted in blue: think about your outcome. Okay. Once you figure out what kind of outcome you have, then you're going to be able to figure out what kind of regression you're going to do. So you're going to do linear regression if your outcome, the dependent variable, is continuous. I'm trying to predict PFDI-20 scores. I'm trying to predict blood pressure assessments. I'm trying to predict cost, something that's continuous. Okay. If I'm trying to figure out whether any of these variables in my study impact my outcome, and that outcome is something continuous, you do linear regression. Logistic regression, which again is more commonly done in our field and in many others, is when you have an outcome that's a dichotomous categorical variable. Basically a yes or no. So prolapse, yes or no. Recurrent incontinence, yes or no. Death, yes or no. Anything that's a dichotomous yes or no outcome, then you're going to use logistic regression. I'm trying to predict the impact of age on prolapse, parity on prolapse, smoking on prolapse. You either have prolapse or you don't have prolapse, so you're going to use logistic regression. Okay.
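Here is that three-group sketch, mirroring the two-group one above, again assuming Python with scipy and made-up data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
age = [rng.normal(60, 10, 100) for _ in range(3)]    # treatments A, B, C
parity = [rng.poisson(2, 100) for _ in range(3)]

# Continuous data, three groups: one-way ANOVA
print(stats.f_oneway(*age))

# Skewed or ordinal data, three groups: Kruskal-Wallis
print(stats.kruskal(*parity))

# Categorical data, three groups: still a chi-square, now on a 3x2 table
# (rows = treatment groups, columns = prior hysterectomy yes/no; invented counts)
table = np.array([[30, 70], [25, 75], [35, 65]])
chi2, p, dof, expected = stats.chi2_contingency(table)
print(chi2, p)
```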
So the key thing, I feel like, where people get mixed up is that they forget to think about the outcome, and that's the way you're going to figure out what kind of regression you're going to do. All right. So let's go, I think, to some questions. What would you use to assess objective cure, yes or no, after surgery B compared to surgery A, while adjusting for all these variables: differences in age, race, and BMI? Would I use linear, logistic, or this Cox proportional hazards regression, which we'll talk more about later? So again, linear, logistic, or Cox proportional hazards regression. Think about the outcome we're trying to predict. Okay, so it should be logistic regression, okay? And remember, we're trying to predict that dependent variable. The outcome is yes or no cure: whether surgery B is better than surgery A, or A better than B, in terms of cure. And I'm adjusting for differences in these two groups, whether it's differences in age, race, BMI, smoking status, diabetes, whatever it might be. I can adjust for anything. But the key thing is my outcome is, do you have cure or not, however I define cure. Because the cure is the outcome, the dependent variable, and it's a yes or no, I use logistic regression. Okay, so the next question: how would you compare improvement in OAB based on the OAB questionnaire, let's say a questionnaire scored from zero to 100, between two different neurostimulation devices, while adjusting for age, BMI, and comorbidities? Should I use linear, logistic, or a Cox proportional hazards regression? So remember, we're thinking about the outcome. What kind of regression should I do? Okay, so in this case, it would be a linear regression, okay? Because the outcome is continuous. I gave you a clue: I said we're using this OAB questionnaire, the scores go from zero to 100, that's a continuous outcome. So again, for that, we're going to use linear regression, and we can adjust for whatever other variables we want to. But the key thing to figure out what kind of regression you're going to do is the outcome, the dependent variable. All right, another question. I'm sorry, I wish I could see all your faces so I could know when people are done answering questions. It's hard to do a webinar without any feedback. Anyway, okay: you conduct a logistic regression analysis to evaluate the association of obesity, yes or no, with MI, adjusting for age and race. Okay, so I'm looking for the association of obesity, yes or no, with MI, adjusting for age and race. Which of the following measures of association would be appropriate for this type of analysis, and is statistically significant at a p-value of less than 0.05? So again, I'm doing a logistic regression analysis, and what I'm looking for is the right measure of association. Essentially, is it an odds ratio or a relative risk? And then which one is also statistically significant at a p-value of less than 0.05? So we haven't really gone through this yet, but I'm assuming many of you who are at this webinar have some research background. Okay, so the first thing I want you to think about: should it be an odds ratio or should it be a relative risk? That will help you figure out, is it A or B or C or D? And then which one can you tell is statistically significant? What do you need to see in the confidence interval for that? Okay, so the answer is B.
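Here is a hedged sketch of why, assuming Python with the statsmodels package and a simulated dataset; the point is simply that logistic regression output is an exponentiated coefficient, an odds ratio, with a confidence interval:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 200
surgery_b = rng.integers(0, 2, n)              # 0 = surgery A, 1 = surgery B
age = rng.normal(60, 10, n)
bmi = rng.normal(28, 5, n)
cure = rng.binomial(1, 0.6 + 0.1 * surgery_b)  # simulated yes/no outcome

# Logistic regression of a dichotomous outcome on surgery, age, and BMI
X = sm.add_constant(np.column_stack([surgery_b, age, bmi]))
fit = sm.Logit(cure, X).fit(disp=0)

# Exponentiating the coefficients gives odds ratios with 95% confidence
# intervals; an OR is significant when its interval does not cross one
print(np.exp(fit.params))      # const, surgery_b, age, bmi
print(np.exp(fit.conf_int()))
```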
Okay, so the reason is, when you do a logistic regression analysis, okay, the output that you always get is an odds ratio. You don't get a relative risk. If you see a logistic regression in a study, the output is always going to be an odds ratio. Okay, so you're going to get whatever the odds ratio is. Then the key thing to figure out if that's statistically significant is to look at your confidence interval, right? If it crosses one, then it's not going to be statistically significant, right? So in the first example, A, you have an odds ratio, so it's the right output for a logistic regression, but the confidence interval crosses one, meaning between 0.8 and four, one is in between, so that's not going to be statistically significant. Okay, but in B, the confidence interval does not cross one. It's all higher than one, so that is going to be the one that's statistically significant. Okay, so we're moving back to the table, again, my cheat sheet table, okay? And we're going to talk a little bit about what you do for paired data, okay? Again, the key thing: first you have to ask, is this data paired? And then once you decide it is paired, you're going to look at the data and say, well, what kind of data do I have? Is it continuous and normally distributed, so I should use parametric statistics? Should I use non-parametric statistics, again, because the data is skewed? Or is my data categorical? Okay, so for continuous data, it's a paired t-test. For non-parametric data, again, my classic example is parity, it's the Wilcoxon signed rank, not to be confused with the Wilcoxon rank sum, which is the same as the Mann-Whitney U. So what I tend to do is use Mann-Whitney U, and I don't use the terminology Wilcoxon rank sum, because then there are two Wilcoxons, which is confusing. So I use Mann-Whitney U, and then for this one, it would be the Wilcoxon signed rank. And then for categorical data, it's McNemar's. Okay, so let's go through some examples here. So what test should be used for comparing PFIQ-7 scores, let's say they range from zero to 300, before and after surgery in a fairly large study, 200 women? That's the first question. So before and after kind of signals paired data, and then I'm looking at PFIQ-7 scores that range from zero to 300. The second question is, how would you compare the proportion of women, or the percent of women, with fecal incontinence, either yes or no, before and after a new drug that you're testing? Okay, so again, before and after, so it's paired data. It's the same group, but it's before and after, so they're tied together; they're paired. And then the issue is the proportion of women with fecal incontinence; that's the variable that you're interested in. Okay, so for the first one, I think you're probably getting more comfortable with saying, oh, yeah, well, that's continuous data. The scores range from zero to 300, okay, same interval between each of the scores, so that's continuous data, in a large study with a lot of women. So you're going to use a paired t-test. And then for number two, again, this is going to be categorical. Do the women have fecal incontinence, yes or no? That's categorical data, and then you're going to use McNemar's test. All right, the next thing is when what you're trying to figure out is: is there an association between two different variables?
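Before getting to associations, a sketch of those paired tests, assuming Python with scipy plus statsmodels (for McNemar's) and made-up before-and-after data:

```python
import numpy as np
from scipy import stats
from statsmodels.stats.contingency_tables import mcnemar

rng = np.random.default_rng(4)
before = rng.normal(150, 40, 200)            # PFIQ-7 scores before surgery (made up)
after = before - rng.normal(30, 20, 200)     # the same 200 women after surgery

# Paired continuous data: paired t-test
print(stats.ttest_rel(before, after))

# Paired skewed or ordinal data: Wilcoxon signed rank
print(stats.wilcoxon(before, after))

# Paired categorical data (fecal incontinence yes/no before vs. after): McNemar's test
# Rows = before (yes, no); columns = after (yes, no); invented counts
table = np.array([[40, 5],
                  [25, 130]])
print(mcnemar(table, exact=False))
```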
And again, that's all going to depend on the kind of data that you have. Is it continuous? Is it, again, non-parametric, so ordinal or skewed data? Or is it categorical? Though for categorical data there isn't really a correlation coefficient; you just do a chi-square. So there's either Pearson's or Spearman's correlation, and they range from minus one to one. The closer they are to minus one or one, the higher the correlation; the closer to zero, the less correlation. So if the correlation coefficient is one, that means, you know, as one variable changes, it correlates with an increase in the other variable, okay? We don't tend to use correlation coefficients very often, but it's just good to have a sense of that. Okay, and then I do want to mention, and this is in a different color on the table because it's a little different than what's on the first table; we've actually gone through all the rows of the table, the cheat sheet table, I'll call it. But survival analysis comes up some. We don't tend to do a lot of it because we don't have a lot of great long-term follow-up data, although it would be great to have more, but it's important to know how you would describe variables over time, okay? So I call it survival, or time, analysis; it's basically an analysis that incorporates time, right? If you have one group, you might just plot out the survival curve; it's called the Kaplan-Meier survival curve. If you have two groups that you're comparing, you might have two of these Kaplan-Meier survival curves, and then you'll compare the curves with a log-rank test. And then there's that other kind of regression analysis we mentioned before; it's called a Cox proportional hazards regression. The output is something called a hazard ratio, and the way I think about it, it's almost like a logistic regression analysis that incorporates time. So it's a regression analysis that incorporates a time component, and then allows you, again, to adjust for other variables in looking at your outcome. So a lot of these analyses are time to event, something that incorporates time. And let's go through another image of what that looks like, okay? So again, you guys have probably seen these Kaplan-Meier curves. They're really common in the gyn-onc literature, right? Where the outcome might be death, and that's why they're called survival curves. So when you look at these curves, there's a blue curve and a red curve, okay? You can see these step downs. So the step downs are an event. And oftentimes, again, because these are survival curves, each one of the step downs is an event like death, okay? So every time you see a step, that's a death, and then another death. And then the tiny little hash marks, I put a little box in there around the vertical lines, that's censored data, okay? So that's when you lose people to follow-up. So if you see a survival curve that has a lot of these vertical lines, it means you lost a lot of people to follow-up, because you're not sure what happened to them. Ideally, you don't want to see a lot of these little vertical lines, meaning you have good follow-up, okay? And I've got these two curves, and then let's say I want to compare: well, are they different?
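We'll answer that in a moment with a log-rank test; as a sketch of the whole survival toolkit, Kaplan-Meier curve, log-rank comparison, and Cox regression, assuming Python with the third-party lifelines package, pandas, and simulated follow-up times:

```python
import numpy as np
import pandas as pd
from lifelines import KaplanMeierFitter, CoxPHFitter
from lifelines.statistics import logrank_test

rng = np.random.default_rng(5)
n = 100
time_a = rng.exponential(60, n)     # months to event in group A (made up)
time_b = rng.exponential(80, n)     # months to event in group B
event_a = rng.binomial(1, 0.8, n)   # 1 = event observed, 0 = censored (the hash marks)
event_b = rng.binomial(1, 0.8, n)

# One group: summarize (or plot) its Kaplan-Meier survival curve
km = KaplanMeierFitter().fit(time_a, event_observed=event_a, label="group A")
print(km.median_survival_time_)

# Two groups: compare the curves with a log-rank test
print(logrank_test(time_a, time_b,
                   event_observed_A=event_a, event_observed_B=event_b).p_value)

# Cox proportional hazards: a regression incorporating time; output is a hazard ratio
df = pd.DataFrame({"time": np.concatenate([time_a, time_b]),
                   "event": np.concatenate([event_a, event_b]),
                   "group_b": np.r_[np.zeros(n), np.ones(n)]})
print(CoxPHFitter().fit(df, duration_col="time", event_col="event").hazard_ratios_)
```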
Like, I can't tell if they're different or not by eyeballing it. And so you can do a log-rank test to compare the two survival curves. And again, typically, we'd say a p-value less than 0.05 is going to be statistically significant. And then again, remember, if you see a Cox proportional hazards model or regression, that's, again, a regression model that incorporates time, but the output for that is a hazard ratio, or HR, okay? It's just paying attention to time here. All right. Okay. So now we're going to go through more questions. I feel like sometimes when you do this, the more questions you have to practice with, the more the information sinks in. So again, if you were going to remember one thing or print up one slide, this would be the slide, okay? Your cheat sheet to figure out the right test to use to analyze your data. And we're going to use this as we go through the rest of the questions. Okay. So now we're going to go to a Q&A session. Again, I'm sorry I can't see your faces to be able to interact with you. Okay. So I'm going to read these questions. Here's the first question. 300 patients with overactive bladder were randomized to one of three groups: drug A, drug B, or placebo for 12 weeks. At 12 weeks, the number of urgency urinary incontinence episodes per day in each group was the following. And then at 12 weeks, we also assessed the proportion of patients with dry mouth. And so you can see the percentages here for each of the drugs and placebo. The first question is, which of the following tests should be used to compare the average number of UUI episodes per day at 12 weeks between the three groups? Okay. I wish I could ask you all what you're thinking. I think you're all thinking ANOVA. And if you did say that, that's correct. Okay. Right. So we know that urgency incontinence episodes per day is going to be a continuous variable. You could see the data above; it has a mean plus or minus standard deviation. And then we've got three groups. So we're going to use an ANOVA for that analysis. All right. Question number two. Same study, three groups. But now the question is, which of the following tests should be used to compare the proportion of patients with dry mouth in each group? Answer number two is chi-square, right? So this is just a proportion. You either have dry mouth or you do not. It's a yes-no categorical variable. You have three groups. And so the answer is chi-square. Right. Even if you just had two groups, drug A and drug B, and you were trying to compare dry mouth, it would still be a chi-square. So it doesn't matter if it's two groups or three groups; for categorical data, it's chi-square. Okay, great. So again, to remind you about the table, we're now in the three-group row. We know that, for example, urgency incontinence episodes per day was continuous data, so it's an ANOVA. Again, we're comparing the percent with dry mouth; that's chi-square. If I were to say, how would you compare parity across these three groups? Okay, that would, again, likely be Kruskal-Wallis, because that data is skewed and we have three groups. So you go down that non-parametric column: Kruskal-Wallis. Okay, question number three. In a prospective cohort study comparing a new stress incontinence surgery to the midurethral sling, the investigators want to assess if the two cohorts are similar or not.
What tests should be used to compare age, parity, smoking status, either yes or no, and the proportion on prednisone? And I gave you a hint: it's a rare characteristic. Okay, so I want you to think about age first. Again, you're thinking, what kind of data is that? How many groups do I have? And then you should have your answer. Parity, my classic example of something. Smoking status. Okay, and if you can, write these down just to commit yourself to an answer, yes or no. And then what's that other test when something's rare, for the prednisone proportion? Okay, so let's go through these answers. Answer three. Okay, so we've done age a bunch of times: continuous data, two groups, Student's t-test. That's going to be one of the most common statistical tests you're going to use. Parity, again, my classic example, skewed data, because you don't have a mean number of babies, you have a median, right? So Mann-Whitney U. Smoking status is a yes or no, so it's categorical, so it's going to be chi-square. And then prednisone use, my clue was, right, it's rare. So anything that's rare, you're thinking Fisher's exact. Now, again, you won't always know, oh, I'm going to do a Fisher's exact. If you are using your statistical software, you'll put in your data, and it'll usually tell you whether it should be a chi-square or a Fisher's exact, based on, again, the expected number in that two-by-two cell, if you're drawing it out. Usually your statistical program will tell you that, but you want to be thinking, oh, I might need to look at the Fisher's exact results because it's a rare variable that I'm looking at. Okay, so again, we've done a lot of examples. These are the most common things that you're going to do. You're going to do a study comparing two groups, so this is the main row on the cheat sheet table that you're going to use. So you probably feel a lot more comfortable, I hope, with how you're going to describe your data, whether you should use the mean or the median, what you do if you have two groups and how you're going to figure out the right test to use, and then what you do if you have three groups. Okay, the questions do get harder as we go along. Okay, and we'll cover some topics that maybe I haven't spoken to, but you probably know the answers to. Okay, let's say you're interested in studying whether route of hysterectomy, laparoscopic versus vaginal, is associated with vaginal cuff dehiscence. What would be the best study design to address this question? Okay, should you randomize people to laparoscopic or vaginal and see who gets a cuff dehiscence? Should you just prospectively follow people who get a laparoscopic hysterectomy or a vaginal hysterectomy and see what happens? Should you do a retrospective cohort, where you look back in time, find all the laparoscopic hysterectomies and all the vaginal hysterectomies, and see what happened? Or should you do a case-control study, where you start with the vaginal cuff dehiscences and figure out whether they had a laparoscopic hysterectomy or a vaginal hysterectomy? Okay, what do you guys think? Okay, so we have not covered study design in this talk, but this would be a classic example of when you would do a case-control study. Okay, so the classic example out there was mesothelioma.
It was a really rare disease, so what you do is you start with finding people who have the rare outcome, and you look back to figure out what the exposure was that you're interested in. Okay, so in this case, the clue should have been vaginal cuff dehiscence. They don't happen that often, right? So if you're going to randomize people to laparoscopic and vaginal hysterectomy, or follow those people over time, can you imagine the number of patients you'd have to follow to find cuff dehiscences? It's just not that common. So when you have a really rare outcome, you want to start with the outcome, because otherwise you might only find maybe 10 cases of cuff dehiscence in many, many years. You start with the cases, and then your exposure of interest, in this case, is route of hysterectomy, and then you want to say, well, did they happen more with laparoscopic hysterectomy or vaginal hysterectomy? Okay, so my clue is: something that's rare should be a case-control study. Otherwise, you're going to have to recruit or follow so many people. It would be cost-prohibitive, because you'd have to follow so many people to find those cuff dehiscences. Okay, so then the corollary to that, the follow-up question, is: for this case-control study where we're looking for the risk of cuff dehiscence between laparoscopic and vaginal hysterectomy, what statement or statements are true regarding the analysis? Okay, so this requires you to think: okay, I'm telling you we're doing a case-control study. What do you think the right kind of analysis should be for a case-control study? And if we were to do a regression, because I want to control for some other variables, what kind of regression would that be? I'm going to give you a little more time for this one, because there could be one right answer or more than one right answer. Okay, let's go through the answer. Okay, so when you're doing a case-control study, you do do a logistic regression if you're trying to adjust for some other confounders. Okay, so again, the main reason it would be logistic regression is that the outcome is going to be cuff dehiscence, yes or no. Remember, the key thing to figure out what kind of regression it's going to be is: what is the outcome that we're interested in? It's a yes or no, dichotomous outcome, so we're going to do logistic regression. And then when you do a logistic regression, the output is always an odds ratio. So automatically, once you figure out, okay, yes, it's a logistic regression, then the odds ratio is going to be the output that you get. You don't get a relative risk from a logistic regression; you get an odds ratio. And then you can control for any of the confounders that you might be interested in. Okay, so again, logistic regression, because the outcome is a yes or no: yes, you had a cuff dehiscence, or no, you did not. Okay, and just a moment about relative risks and odds ratios, since I've talked about them a little bit. Typically, if you're doing a cohort study or a randomized trial, you can come up with a relative risk. These are often easier to interpret, because you could say the relative risk of the exposed is this compared to the unexposed. However, if you have an odds ratio, you have to say something like: this person has 3.2 times the odds of getting this outcome compared to the control group.
But saying the word odds is sometimes harder to conceptualize than risk. Just to remind yourselves: if you do a case-control study, you get an odds ratio; you would do a logistic regression. And also, if you do a logistic regression for any kind of study, because you have an outcome that's dichotomous, yes or no, the output will be an odds ratio. The key thing is that the odds ratio will be similar to the relative risk, and we're always trying to get at the risk of something. And that works pretty well; the odds ratio and the relative risk are more similar when the outcome is not that common. If you have a really common outcome, then your odds ratio will exaggerate the actual relative risk. And so you just want to be aware of that. Okay. I've given other talks where I go into a lot more detail and give examples about why this occurs and when this happens. But the whole point is that if you're doing a case-control study, it should be a rare outcome. And in that case, the odds ratio will be very similar to the relative risk. So that works pretty well, okay? But again, anytime you do a logistic regression, you will always get an odds ratio as your output. All right, question six. Which measures of association are significant at a p-value of less than 0.05? There are a lot of options, so I'm going to give you some time to think about this. And we still have 15 minutes, so we're good. I just have a few more questions, and then I'll give you a chance to ask me questions. So again, if you can write down the answers and just commit yourself, that'll, I think, always be helpful. Okay, so the responses that are highlighted in yellow are the correct answers. So let's go through these, okay? So the goal, when we're thinking about something that's significant at a p of less than 0.05, is that you just do not want the confidence interval to cross one. Okay, so letter B is correct: the odds ratio is 0.5, but the confidence interval does not cross one. It goes from 0.1 to 0.9, and so that's significant. C is also correct because that relative risk is 1.1, and even though the confidence interval is just a little bit over one, it doesn't dip below one to, like, 0.9, for example. It's 1.01 to 1.24, so that's also correct. D is not correct, right? Because you see the confidence interval starts at 0.1 and crosses over one to get to 10. And similar with A; again, that confidence interval incorporates one. So that would not be statistically significant at a p-value of less than 0.05. And then E, remember, that's that hazard ratio. So that's from a Cox proportional hazards model, a regression analysis that you would think about if you're trying to incorporate time and you want to adjust for other factors. So that's an HR, a hazard ratio. And the hazard ratio is 6. But again, if you look at the confidence interval, it's wider, right? It kind of expands more. It's 1.3 to 13, but again, it does not incorporate one, does not cross one. So that's also statistically significant. And so a lot of times, when you are reporting your measure of association, your odds ratio, your relative risk, or your hazard ratio, you don't need the p-value, because we can tell if it's significant just based on looking at the confidence interval.
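Both rules of thumb here, that a ratio measure is significant when its interval does not cross one, and that the odds ratio approximates the relative risk only when the outcome is rare, are easy to sanity-check; this is a sketch with invented confidence intervals and two-by-two counts:

```python
def significant(lo, hi):
    # A ratio measure (OR, RR, HR) is significant at p < 0.05 when its
    # 95% confidence interval does not include 1
    return not (lo <= 1.0 <= hi)

print(significant(0.8, 4.0))    # False: crosses one
print(significant(0.1, 0.9))    # True: entirely below one
print(significant(1.01, 1.24))  # True: entirely above one, if only barely
print(significant(1.3, 13.0))   # True, though such a wide interval is imprecise

def odds_ratio_and_relative_risk(a, b, c, d):
    # 2x2 table: a, b = exposed with/without the outcome;
    #            c, d = unexposed with/without the outcome
    rr = (a / (a + b)) / (c / (c + d))
    odds = (a * d) / (b * c)
    return odds, rr

print(odds_ratio_and_relative_risk(10, 990, 5, 995))  # rare outcome: OR ~ RR (~2.0 each)
print(odds_ratio_and_relative_risk(60, 40, 30, 70))   # common outcome: OR 3.5 vs RR 2.0
```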
So a lot of times, again, you would just report the confidence interval with the odds ratio or the relative risk or the hazard ratio. But just a bit about confidence intervals. This tells you about the reliability of an estimate. And so again, when you have the 95% confidence interval, what does that even mean, right? Well, it means that you're 95% certain that the true value of whatever that variable is, that parameter that you're looking at, sits within that confidence interval, okay? So you're 95% certain that the correct value is within that confidence interval, okay? So what you want is a really narrow, or tight, confidence interval versus something that's really wide. And that gets to precision, okay? So the confidence interval helps you describe the precision of the estimates. The more narrow the confidence interval, the more precise, okay? So I gave you two examples. Both of them have an odds ratio of 4.5, which sounds like, wow, that factor really increases the odds of whatever the outcome is that you're looking at. But if you were to look at the confidence intervals, they're very different, right? The first confidence interval is very wide. It's not very precise. So you can be 95% certain that that odds ratio is between basically 1.1 and 20, right? So that's not as helpful in a way, or not as ideal; it's not as precise. Versus if you look at the confidence interval below: that confidence interval is between 3.5 and 5.6. So I can be 95% certain that the actual odds ratio lies between 3.5 and 5.6. That gives me more confidence, right, in what that odds ratio is telling me, versus the top odds ratio that's the same, but the confidence interval is so wide. I mean, the odds ratio could be 1.1. So that really isn't a lot of increased risk for that variable that I'm looking at. Okay, so that's what you want to see when you're looking at the confidence interval. You don't need a p-value; you can tell which ones are statistically significant. But the other thing that you're looking for is how wide or narrow the confidence interval is, to get a sense of precision. Okay, question number seven. Okay, so we're going to get a little bit more into sample size. I didn't have a lot of time to talk through that either, but this is another practical thing that you need to figure out if you're going to design your research study, right? So you want to figure out, one, how do you present the data that you have? Two, how do you analyze the data that you have? But this steps back to the initial phases, when you're trying to plan your study and you're like, gosh, how many people do I need to recruit for the study? So let's talk about sample size. Okay, so women with stage two or higher anterior vaginal prolapse are randomized to either an anterior repair versus this new surgery that you're going to do, okay? Let's say the cure rate for an anterior repair at one year is 60%. You think that the new surgery is better and that the cure rate at one year will be 80%. Okay, wouldn't that be great if we had something better than an anterior repair? Anyway, okay, what additional data are needed to calculate a sample size for your study? What are the variables you need? Okay: an alpha, power, standard deviation. You can pick the option of alpha plus power, so A and B, or do I need all of them? Do I need alpha, power, and standard deviation?
So again, for those of you who've already designed a study before or done some sample size estimates, this might be more familiar. Certainly, for those of you studying for boards, this could come up. So you want to understand what you need to figure out sample size. All right, answer seven. So you need an alpha and you need the power. Okay, you only need the standard deviation if the outcome that you're looking at is a continuous variable, right? What we're looking at is percent cure, 60 versus 80%. It's not a continuous variable. It's just a proportion. So you don't need the standard deviation, but you need your alpha and you need your power. Let's talk a little bit; I think this is giving you an example of how you would phrase it. So sometimes, you know, when I'm reviewing manuscripts, I want to be able to really understand how they came up with their sample size. So I look for wording that makes sense to me, okay? So I put in here some wording: in order to detect a difference between a cure rate of 60% for an anterior repair and 80% for the new surgery, we would need 91 women per group at an alpha of 0.05 and power of 0.8, okay? And I put in here, hopefully many of you know EpiInfo or OpenEpi. It's from the CDC. It allows you to do some sample size estimates; other websites are available, but this is a nice one to go to. And I put that on the reference page as well. Okay, so let's talk a little bit about sample size and this alpha and beta, or power, because this also really comes up if you're taking any kind of test. Okay, so type one error, or your alpha, is the probability of incorrectly rejecting the null hypothesis. And I must say, with some of these terms and definitions, some things are going to resonate for some people but not others. Like, how can you remember that? Do you remember type one? Do you remember alpha? I put in other ways of thinking about it that have helped me. So I think it's helpful to think of it as the probability of detecting a difference when there really is not one. Okay, so that's not great. You don't want to do that a lot. You don't want to say, oh, yes, there's a difference between these two when there really is not a difference. That's not a great error to make, okay? Some people call this a false positive. That means you're saying, gosh, there's a difference, but there really is not one. This is false, so it's a false positive. So you want the alpha to be low. You don't want to state that there's a difference if one doesn't exist. That's kind of a bad mistake to make. So that's why our alphas are pretty low; it's usually 0.05. If you go even lower, right, you're going to have to have a bigger sample size to detect that difference, and that might be hard in terms of logistics. We typically will set our alpha at 0.05. All right, type two error, or beta. This is the probability of not detecting a difference when there is one, okay? So you can also call that your false negative. You state that there's no difference, but that's false; there is a difference, okay? So often we'll set our beta at, like, 0.2, and then power is one minus beta, so it's 0.8. The beta is often higher, because it's not as bad to have a type two error as it is to have a type one error.
Okay, again, so type one, or alpha, is saying there's a difference when there is not one, but the beta is saying there's not a difference when there actually is one. Okay, so it's, again, hard to always remember these two, but if you think about it, alpha is like the worst mistake to make, so you want a very low chance of an alpha error, that type one error. And then beta, that power issue, it's not as bad to have a beta error in a way, right, or type two error. So you can have a little bit of a higher percentage for that. And that's, again, the thing you want to think about when you are thinking about your sample size. Okay, so we do have a couple more questions and we're running out of time, so I do want to get through this. This is more about sample size. Ideally, a study should be large enough to have a high probability, or power, so power is one minus beta, of detecting a statistically significant and clinically important difference. So when you're thinking about power, you want to know: what are your estimated outcomes in each group? What's the alpha? Again, often 0.05. What's the power? So power, again, is one minus beta. If beta is 0.2, then your power is 0.8. And then if it's a continuous outcome, you need your standard deviation. Okay, so in that first example, we have anterior repair, 60% cure; new surgery, 80% cure. We said alpha was 0.05, power was 0.8. We said we're going to have the same number in each group, a one-to-one ratio of recruitment of the anterior repair to the new surgery. So we found out, in the program that we use, you need 91 subjects per group. Okay, question eight. And then, Dr. Harvey, tell me, because I know we're supposed to get to eight, and I don't know if people want me to continue with questions, because I can also send this out as well. The next example, I'll just say, is you're comparing, again, anterior repair to the new surgery. Your outcome is quality of life. So it's a scale of zero to 100; it's a continuous outcome. I'm telling you the minimally important difference is 10. Prior data show that the mean quality of life at three months after an anterior repair is 70. What additional data are needed to calculate a sample size to detect a minimally important difference? I'm going to move this along quickly. So the difference in this one: you always need alpha and you always need power, but here you also need the standard deviation, because your outcome is continuous. And then I give some wording for that. So again, I'm happy to share the slides, but I do want to get to this last question. In the prior example, again, we said that there are 91 people per group when comparing anterior repair versus the new surgery. I want you to think through what would happen if you made these changes in the different variables here. If I wanted more power, going from 80 to 90%, does the sample size increase or decrease? If I want the alpha to go from 0.05 to 0.01, and we keep the power the same, would the sample size increase or decrease? And then let's say that I'm trying to detect a smaller difference, with the anterior repair cure rate being higher, 70%, not 60%, and I keep alpha and power the same. Would the sample size increase or decrease? And I'm quickly going to go through the answers. Okay, so I also put in what the numbers were. So if you want more power to detect that difference, you're going to need more patients, right? So instead of 91, you need 119 per group.
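For anyone who wants the arithmetic behind these per-group numbers, here is a sketch of a two-proportion sample size calculation, a normal approximation with the Fleiss continuity correction, which is my understanding of the kind of formula OpenEpi-style calculators use; under that assumption it reproduces the figures quoted here and just below (91, 119, 131, 313):

```python
from math import ceil, sqrt
from scipy.stats import norm

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Two-proportion sample size: normal approximation plus the Fleiss
    continuity correction (an assumption about what OpenEpi-style
    calculators do); not a substitute for your statistical software."""
    za, zb = norm.ppf(1 - alpha / 2), norm.ppf(power)
    pbar = (p1 + p2) / 2
    n = (za * sqrt(2 * pbar * (1 - pbar))
         + zb * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / (p1 - p2) ** 2
    return ceil((n / 4) * (1 + sqrt(1 + 4 / (n * abs(p1 - p2)))) ** 2)

print(n_per_group(0.60, 0.80))              # 91 per group, the base case
print(n_per_group(0.60, 0.80, power=0.90))  # more power -> 119
print(n_per_group(0.60, 0.80, alpha=0.01))  # smaller alpha -> 131
print(n_per_group(0.70, 0.80))              # smaller difference -> 313
```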
If you want, again, the alpha to go down, the type one error to go down, you're going to need more people: 131. And then if you're trying to detect a smaller difference between the two groups, well, look at the change. Instead of 91 per group, you need 313. And so when you're planning your study, you want to get a sense of, usually, alpha, where the typical is 0.05, and power, 0.8. And then you want to get a sense of what's a feasible study, but also what makes sense to you. Okay, quick recap. These were the objectives. I hope you understand the different types of data. We talked about commonly used statistics, and now hopefully you have a better sense of what tests you should use to analyze your own data. Remember, these are the things you want to think about when you're picking a test: type of data, parametric versus non-parametric statistics, how many groups, and are your data paired or not? And then our cheat sheet table; that should get you a lot of the way through your data analysis. Although, again, there are always nuances and there's so much to know, and I have, you know, reliable people that I consult. Okay, and then some references. You know, there are the first two articles; someone actually gave them to me, Matt Barber sent them to me, and they were these awesome articles on statistics for non-statisticians in a kind of not well-known journal, but they're good articles. I put in the EpiInfo link for sample size, and then this SJS research handbook that was from 2011, because statistics don't really change that much. Okay, should I stop sharing now? I don't know if that will help me see any questions. I don't think so. I'll read the questions to you. Okay. So thank you, Dr. Wu, for your presentation, and we have about one minute, so I'm just going to go over the questions that we have right now in the Q&A. The first one is: what do you think is the best or highest-yield statistical software for a fellow to become acquainted with? That's a great question. I think some of this depends on what your mentors use, right? So if you're a fellow and your mentors have expertise in one or another, they're going to be able to hopefully teach you how to do that. I might say, in our fellowship program, we tend to teach SPSS, but I also have done some training in Stata. So I'm a Stata and SPSS person. Other programs are SAS, or a sort of light version of SAS called JMP. I think some people are using R. So I would go with figuring out what your mentors use, or if your mentors don't do analyses, what the statisticians you work with use. Oftentimes, different programs will have, you know, resources outside of their division or even the department. Like at UNC, there's this institute that helps you with statistics, and they've even given our fellows in the past some short courses on SPSS. So I think SPSS is much more user-friendly, because you don't have to type in code like you would for Stata or SAS. That's a quick answer. The other thing I would say: there are institutional licenses, so you get something for free and it's supported. So check with your institution. Yeah, like ours for SPSS is like $100 a year, you know, so you don't have to buy the whole package, which would be crazy expensive. A question on: can you kindly explain the p-value and how to calculate it? So I would say, you know, oftentimes we're not calculating the p-value. Your statistical software is calculating the p-value, right?
I think if you went and did a hardcore biostats course, you probably could figure out how to calculate it, but in practice you would never really have to calculate a P-value by hand. The thing that's helpful, again, is looking at the confidence interval: if the confidence interval around an odds ratio doesn't cross one, you know that outcome is statistically significant at a P-value of less than 0.05. But usually you're relying on your statistical software to output the P-value for you; you're not sitting there calculating it yourself. Anything you want to add to that, Dr. Harvey? No; we're all just going to look at our statistical software. Yeah, exactly. Do you have a general cutoff for when you consider something to be a rare event, to choose a case-control study design? Yeah, that's a great question. I would say you definitely want your outcome to be less than 10%. If it gets higher than that, you run into this odd situation where the odds ratio will always exaggerate the relative risk, when the risk is really what you're after. So I'd say something less than 10%. Again, it's always going to be easier, practically, to do a case-control study when the outcome is rare. I used cuff dehiscence as an example, but sepsis, DVT, anything that's rare, even mesh exposure is relatively rare. It's going to be easier to find the cases, this person has the rare outcome, this group doesn't, and look back to see who had the exposure, because otherwise it would take a really large number of patients recruited prospectively to observe a rare outcome. So a lot of it is the practical approach, what's doable. If you had unlimited funds, you could certainly do something prospective, which always gives you all the data you want, in the way you want it. But I would say less than 10% as your outcome is a good guideline, and the rarer the outcome, the more you want to think about case-control. Next question: when the odds ratio shows a confidence interval that's broad, does that become a limitation? Well, let's say I'm reviewing a paper and I see a confidence interval that goes from one to 30, or one to 20, or even 1.1 to 15. What do I take away? What's the message? It's not really precise. How confident am I that the odds ratio is really eight, that having diabetes really increases your risk of having prolapse recurrence after surgery by that much? If the confidence interval is 1.05 to 20, it just makes me think a little more; it's not as convincing. And we rarely see really high odds ratios in real life; we don't find a lot of variables that increase your odds by 15 or 20 times, so that's uncommon. But it is what it is, right? If you have a small study without a lot of patients, you might not be able to get a really precise estimate.
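These two answers lend themselves to a quick illustration. The sketch below uses entirely hypothetical counts and shows two things: why an odds ratio only approximates the relative risk when the outcome is rare, roughly under that 10% mark, and how the same odds ratio of eight comes with a very wide Wald confidence interval in a small study but a much tighter one when the study is ten times larger.

import math

def rr_vs_or(risk_exposed, risk_unexposed):
    """Relative risk and odds ratio computed from two outcome risks."""
    rr = risk_exposed / risk_unexposed
    odds_ratio = ((risk_exposed / (1 - risk_exposed))
                  / (risk_unexposed / (1 - risk_unexposed)))
    return rr, odds_ratio

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI from a 2x2 table:
    a/b = cases/non-cases among exposed, c/d = among unexposed."""
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lo = math.exp(math.log(odds_ratio) - z * se)
    hi = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lo, hi

print(rr_vs_or(0.04, 0.02))  # rare outcome:   RR 2.0, OR ~2.04 (close)
print(rr_vs_or(0.40, 0.20))  # common outcome: RR 2.0, OR ~2.67 (exaggerated)

print(odds_ratio_ci(8, 2, 10, 20))      # OR 8.0, CI ~1.4 to ~45 (small study)
print(odds_ratio_ci(80, 20, 100, 200))  # OR 8.0, CI ~4.6 to ~14 (10x larger)

The point estimate is identical in both tables; only the precision changes with the number of patients, which is exactly the point about small studies and imprecise estimates.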
Similarly, if you have a really large study, you're going to get really precise estimates just because of the sheer number of patients. But your data are your data; you can't do much about that. One last question: do you have recommendations for tutorials for SPSS? I don't know of a specific one. People, if you can use the chat, maybe you can share with each other. I will say there are a lot of tutorials built in when you get SPSS. And when there's something I'm not sure how to do, I literally just Google it: "SPSS, how do I do this?" You can just look things up, right? But from a nuts-and-bolts practical standpoint, I don't know of one. Again, at UNC we have this institute with resources for learning, and I suspect if you look online there could be something, but ideally look to your institution to see what the resources are. I don't know of one off the top of my head where you could just sign up for an SPSS tutorial. I remember a long time ago when I was a fellow, we had someone do that with us for fellows' education, which was pretty great. But I don't know if anyone else has any ideas. I'm sorry, I don't think I can see the chat or the participants. Yeah, I can't see it either. So maybe if people have some resources that they know of, they can email you so that you can disseminate them? And I'm happy to disseminate the slides too. The slides are going to be posted on the website by the end of tomorrow. Even better. So, lots and lots of comments of thank you and fabulous lecture. On behalf of the AUGS Education Committee, I'd like to thank Dr. Wu and everyone for joining us today. The next webinar is entitled Voiding Dysfunction After Stress Incontinence Surgery and will be presented by Dr. Steven Krause and Dr. David Austin on December 16th. Well, thank you all for your time and attention. It's always hard to make statistics interesting, so I hope you all learned something. It was great to spend the evening with you, even though I didn't get to see any of your faces. Thanks for attending; I really appreciate it. Thank you. Okay, bye everybody. Thank you so much.
Video Summary
In this webinar, Dr. Jennifer Wu discusses statistical concepts and their application in research studies. She covers the different types of research data and variables, commonly used statistical tests and how to choose the appropriate test for each data type, regression analysis, sample size and power calculations, and the interpretation of confidence intervals and p-values, including type I and type II errors. The session concludes with a Q&A. Dr. Wu is an experienced clinician and researcher in obstetrics and gynecology with a background in epidemiology and public health.
Asset Subtitle
Jennifer M. Wu, MD, MPH
Keywords
Dr. Jennifer Wu
statistical concepts
research studies
types of research data
statistical tests
regression analysis
variables
confidence intervals
p-values
type I errors