  Topic: mafs question
Richardthughes



Posts: 11178
Joined: Jan. 2006

(Permalink) Posted: Mar. 10 2011,11:59   

for a friend, actually:

"I’ve got one population that is a sub-set of another. I know the counts in both, the means and the standard deviations. What stat test can I use to tell if the mean of the sub-set is statistically different from the known mean of the whole? I think a t-test assumes two independent sets of data which these are not."

Halp!

--------------
"Richardthughes, you magnificent bastard, I stand in awe of you..." : Arden Chatfield
"You magnificent bastard! " : Louis
"ATBC poster child", "I have to agree with Rich.." : DaveTard
"I bow to your superior skills" : deadman_932
"...it was Richardthughes making me lie in bed.." : Kristine

  
Henry J



Posts: 5786
Joined: Mar. 2005

(Permalink) Posted: Mar. 10 2011,12:15   

Maybe draw bell curves for both, and see if the subset curve straddles the mean of the whole, or is lumped on one side of it?

  
Richardthughes



Posts: 11178
Joined: Jan. 2006

(Permalink) Posted: Mar. 10 2011,12:23   

I need an actual statistical test, probably with confidence and what not.


  
Henry J



Posts: 5786
Joined: Mar. 2005

(Permalink) Posted: Mar. 10 2011,12:29   

Oh. Well, I'm confident that I don't have one of those.

  
Erasmus, FCD



Posts: 6349
Joined: June 2007

(Permalink) Posted: Mar. 10 2011,12:36   

What does he mean by "population is a subset of the other"? Counts and means of what?

--------------
You're obviously illiterate as hell. Peach, bro.-FtK

Finding something hard to believe based on the evidence, is science.-JoeG

the odds of getting some loathsome taint are low-- Gordon E Mullings Manjack Heights Montserrat

I work on molecular systems with pathway charts and such.-Giggles

  
Richardthughes



Posts: 11178
Joined: Jan. 2006

(Permalink) Posted: Mar. 10 2011,12:40   

say there's 100 (count) thingies with a mean of 5 and a standard deviation of 3. He's pulled out 10 (count) of those (a subset) with a mean of 4.5 and a standard deviation of 2.

*made up example/ edited & enhanced for moar better clarity and diplomatic relations.


  
Schroedinger's Dog



Posts: 1692
Joined: Jan. 2009

(Permalink) Posted: Mar. 10 2011,14:35   

Rich: could you correct spelling and such so that we non-US can have a clear picture of what you need?

:)


Damn; I feel like a grammar nazi today!

--------------
"Hail is made out of water? Are you really that stupid?" Joe G

"I have a better suggestion, Kris. How about a game of hide and go fuck yourself instead." Louis

"The reason people use a crucifix against vampires is that vampires are allergic to bullshit" Richard Pryor

   
Richardthughes



Posts: 11178
Joined: Jan. 2006

(Permalink) Posted: Mar. 10 2011,15:21   

Zut alors! Done.


  
Dr.GH



Posts: 2333
Joined: May 2002

(Permalink) Posted: Mar. 10 2011,18:37   

If I understand your question, T-test. Delete the subsample from the initial sample.
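A rough sketch of that suggestion in Python, working only from summary statistics. It uses the made-up numbers from earlier in the thread (100 items with mean 5 and SD 3; a subset of 10 with mean 4.5 and SD 2). It assumes both SDs are sample SDs (n - 1 denominator), and it uses a Welch t statistic as one reasonable choice for comparing the two disjoint groups; the function names are hypothetical.

```python
import math

def remove_subset(n_all, mean_all, sd_all, n_sub, mean_sub, sd_sub):
    """Summary stats of the items left after removing the subset.

    Assumes both SDs are sample SDs (n - 1 denominator).
    """
    n_rest = n_all - n_sub
    mean_rest = (n_all * mean_all - n_sub * mean_sub) / n_rest
    # Rebuild the raw sums of squares from each (n, mean, sd) triple.
    sumsq_all = (n_all - 1) * sd_all**2 + n_all * mean_all**2
    sumsq_sub = (n_sub - 1) * sd_sub**2 + n_sub * mean_sub**2
    sumsq_rest = sumsq_all - sumsq_sub
    var_rest = (sumsq_rest - n_rest * mean_rest**2) / (n_rest - 1)
    return n_rest, mean_rest, math.sqrt(var_rest)

def welch_t(n1, m1, s1, n2, m2, s2):
    """Welch t statistic and its approximate degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Made-up example values from the thread.
n_rest, mean_rest, sd_rest = remove_subset(100, 5.0, 3.0, 10, 4.5, 2.0)
t, df = welch_t(10, 4.5, 2.0, n_rest, mean_rest, sd_rest)
print(f"rest: n={n_rest}, mean={mean_rest:.3f}, sd={sd_rest:.3f}")
print(f"Welch t={t:.3f}, df={df:.1f}")
```

Once the subset is removed, the two groups share no items, which is what addresses the independence worry in the original question.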

--------------
"Science is the horse that pulls the cart of philosophy."

L. Susskind, 2004 "SMOLIN VS. SUSSKIND: THE ANTHROPIC PRINCIPLE"

   
Richardthughes



Posts: 11178
Joined: Jan. 2006

(Permalink) Posted: Mar. 10 2011,18:39   

Quote (Dr.GH @ Mar. 10 2011,18:37)
If I understand your question, T-test. Delete the subsample from the initial sample.

Thank you sir. Would Anova make any sense?


  
George



Posts: 316
Joined: Feb. 2006

(Permalink) Posted: Mar. 11 2011,01:34   

A two-sample ANOVA and a t-test are equivalent.
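A small numerical check of that equivalence, as a sketch: for two groups, the one-way ANOVA F statistic equals the square of the pooled-variance (equal-variance) t statistic. The two groups below are hypothetical values chosen only for illustration.

```python
import math
from statistics import fmean

def pooled_t(a, b):
    """Two-sample t statistic with a pooled variance estimate."""
    na, nb = len(a), len(b)
    ma, mb = fmean(a), fmean(b)
    ss_a = sum((x - ma) ** 2 for x in a)
    ss_b = sum((x - mb) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (ma - mb) / math.sqrt(pooled_var * (1 / na + 1 / nb))

def one_way_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand = fmean(all_values)
    ss_between = sum(len(g) * (fmean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical data, not from the thread.
group_a = [4.1, 5.3, 2.8, 6.0, 4.4]
group_b = [5.9, 7.2, 6.1, 4.8, 6.6, 7.0]

t = pooled_t(group_a, group_b)
f = one_way_f([group_a, group_b])
print(f"t^2 = {t**2:.6f}, F = {f:.6f}")
```

The match is with the pooled t-test specifically; a Welch (unequal-variance) t statistic would generally give a different number.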

  
Raevmo



Posts: 235
Joined: Oct. 2006

(Permalink) Posted: Mar. 11 2011,05:37   

Quote (Richardthughes @ Mar. 10 2011,11:59)
for a friend, actually:

"I’ve got one population that is a sub-set of another. I know the counts in both, the means and the standard deviations. What stat test can I use to tell if the mean of the sub-set is statistically different from the known mean of the whole? I think a t-test assumes two independent sets of data which these are not."

Halp!

If the mean (say mu) of the total population is known (say mu=mu_0), and the "counts" in the total population are normally distributed (the word "counts" actually suggests that the data are count data, i.e. non-negative integers, rather than continuous normal data), then a one-sample t-test could be used to test whether the sample is from a population with mu=mu_0.
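For the one-sample t-test, only the subset's n, mean and SD plus mu_0 are needed. A minimal sketch, plugging in the made-up numbers from earlier in the thread (subset of 10 with mean 4.5 and SD 2, whole-population mean 5); the function name is hypothetical and the critical value is the usual two-sided 5% table value for 9 degrees of freedom.

```python
import math

def one_sample_t(n, mean, sd, mu_0):
    """t statistic and degrees of freedom for H0: population mean == mu_0."""
    se = sd / math.sqrt(n)  # standard error of the subset mean
    return (mean - mu_0) / se, n - 1

T_CRIT_9DF = 2.262  # two-sided 5% critical value for 9 df, from standard t tables

t, df = one_sample_t(n=10, mean=4.5, sd=2.0, mu_0=5.0)
print(f"t={t:.3f} on {df} df")
print("reject at 5%:", abs(t) > T_CRIT_9DF)
```

With these numbers |t| is about 0.79, well under 2.262, so the made-up subset would not look significantly different from the whole at the 5% level.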

Alternatively, take bootstrap samples from the sample and see how far out  mu_0 is in the bootstrap distribution of the mean.
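The bootstrap version needs the raw subset values rather than just the summary statistics. A sketch, assuming a hypothetical subset of 10 values with mean 4.5 (the values, function names and seed are made up for illustration) and a simple percentile interval:

```python
import random
import statistics

def bootstrap_means(sample, n_boot=10_000, seed=0):
    """Sorted means of n_boot resamples drawn with replacement from sample."""
    rng = random.Random(seed)
    n = len(sample)
    return sorted(statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_boot))

def percentile_interval(sorted_means, level=0.95):
    """Percentile bootstrap interval read off the sorted resample means."""
    alpha = (1 - level) / 2
    lo = sorted_means[int(alpha * len(sorted_means))]
    hi = sorted_means[int((1 - alpha) * len(sorted_means)) - 1]
    return lo, hi

sample = [2, 3, 3, 4, 4, 5, 5, 5, 6, 8]  # hypothetical subset values
mu_0 = 5.0                               # known mean of the whole

means = bootstrap_means(sample)
lo, hi = percentile_interval(means)
print(f"95% bootstrap interval for the subset mean: ({lo:.2f}, {hi:.2f})")
print("mu_0 inside interval:", lo <= mu_0 <= hi)
```

If mu_0 falls well outside the interval, that is the bootstrap analogue of a small p-value from the t-test.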

--------------
After much reflection I finally realized that the best way to describe the cause of the universe is: the great I AM.

--GilDodgen

  
Bob O'H



Posts: 2564
Joined: Oct. 2005

(Permalink) Posted: Mar. 11 2011,07:06   

A practical suggestion: if the subset is a small subset of the full data, don't worry and treat it as independent.

Means usually become normally distributed very quickly, so bog standard t- and z- tests are fine.
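As a sketch of the z-test version: treat the whole population's mean and SD as known, treat the subset as an independent sample, and compare the subset mean against mu_0. The numbers are the made-up ones from earlier in the thread; the function name is hypothetical.

```python
import math
from statistics import NormalDist

def z_test(n, mean, mu_0, sigma):
    """Two-sided z-test of a sample mean against a known mean and SD."""
    z = (mean - mu_0) / (sigma / math.sqrt(n))
    p = 2 * NormalDist().cdf(-abs(z))
    return z, p

# Subset of 10 with mean 4.5; whole population mean 5 and SD 3.
z, p = z_test(n=10, mean=4.5, mu_0=5.0, sigma=3.0)
print(f"z={z:.3f}, two-sided p={p:.3f}")
```

For these numbers the p-value comes out at roughly 0.6, so nothing unusual about the subset mean.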

--------------
It is fun to dip into the various threads to watch cluelessness at work in the hands of the confident exponent. - Soapy Sam (so say we all)

   
George



Posts: 316
Joined: Feb. 2006

(Permalink) Posted: Mar. 11 2011,07:15   

Another thought: counts data are often Poisson distributed and not suitable for t-tests / ANOVA.  However, a simple transformation should work the trick.  IIRC, a square root transformation is often best for counts data.
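A quick simulation of why the square root helps, as a sketch: for Poisson data the variance grows with the mean, but after a square-root transform the variance settles near 1/4 whatever the mean. The sampler below is Knuth's simple multiplication method (fine for modest means); the means, sample size and seed are arbitrary choices.

```python
import math
import random
from statistics import variance

def poisson(rng, lam):
    """Draw one Poisson(lam) value using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

rng = random.Random(1)
results = {}
for lam in (10, 40):
    draws = [poisson(rng, lam) for _ in range(20_000)]
    raw_var = variance(draws)                            # roughly lam
    sqrt_var = variance([math.sqrt(x) for x in draws])   # roughly 0.25
    results[lam] = (raw_var, sqrt_var)
    print(f"lam={lam}: var(x)={raw_var:.2f}, var(sqrt(x))={sqrt_var:.3f}")
```

The stabilised variance is what makes t-tests and ANOVA on the transformed counts more reasonable.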

  
Raevmo



Posts: 235
Joined: Oct. 2006

(Permalink) Posted: Mar. 11 2011,07:16   

The problem is a bit weird. Suppose it turns out that a t-test says it's very unlikely that the sample was taken from a population with mean mu=mu_0, i.e. has a very small p-value, even though we know for sure that the sample was taken from a population with mean mu_0. Then what? The only sensible conclusion then seems to be that the sampling procedure was "non-random" in some sense. Does that make sense, Oh Bob?


  
Raevmo



Posts: 235
Joined: Oct. 2006

(Permalink) Posted: Mar. 11 2011,07:23   

Quote (George @ Mar. 11 2011,07:15)
Another thought: counts data are often Poisson distributed and not suitable for t-tests / ANOVA.  However, a simple transformation should work the trick.  IIRC, a square root transformation is often best for counts data.

Or a log-transformation. Or run a GLM with a log link and the family=poisson option [R code: glm(counts~1,family=poisson,data=teh.sample)] and test whether the intercept is significantly different from what's expected. The intercept is on the log scale, so the comparison is with log(mu_0): (intercept - log(mu_0))/se(intercept) ~ N(0,1).
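For an intercept-only Poisson GLM with a log link, the fit has a closed form, so the Wald test can be sketched without a GLM library: the fitted intercept is log(mean of the counts), its standard error is 1/sqrt(sum of the counts), and on the log scale the intercept is compared with log(mu_0). The counts below are hypothetical and the function name is made up; this is a stand-in for the R call, not its output.

```python
import math
from statistics import NormalDist

def poisson_intercept_wald(counts, mu_0):
    """Wald z-test of an intercept-only Poisson GLM (log link) against log(mu_0)."""
    total = sum(counts)
    intercept = math.log(total / len(counts))  # MLE of the intercept, log scale
    se = 1 / math.sqrt(total)                  # from Fisher information n * exp(intercept)
    z = (intercept - math.log(mu_0)) / se
    p = 2 * NormalDist().cdf(-abs(z))
    return intercept, se, z, p

counts = [2, 3, 3, 4, 4, 5, 5, 5, 6, 8]  # hypothetical count data
intercept, se, z, p = poisson_intercept_wald(counts, mu_0=5.0)
print(f"intercept={intercept:.3f}, se={se:.3f}, z={z:.3f}, p={p:.3f}")
```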


  
Erasmus, FCD



Posts: 6349
Joined: June 2007

(Permalink) Posted: Mar. 11 2011,07:26   

DO NOT LOG TRANSFORM COUNT DATA HOMO


  
Schroedinger's Dog



Posts: 1692
Joined: Jan. 2009

(Permalink) Posted: Mar. 11 2011,07:32   

I just realised that the "mafs" in the topic heading was standing for "maths". I should have avoided this topic from the very start. Head is aching now...


   
Raevmo



Posts: 235
Joined: Oct. 2006

(Permalink) Posted: Mar. 11 2011,07:43   

Quote (Erasmus, FCD @ Mar. 11 2011,07:26)
DO NOT LOG TRANSFORM COUNT DATA HOMO

Here in the modern day Sodom (the Netherlands) we consider that perfectly acceptable. (Admittedly, we sometimes add 1 to the counts to prevent plugging zero-counts into the log).


  
Bob O'H



Posts: 2564
Joined: Oct. 2005

(Permalink) Posted: Mar. 11 2011,09:16   

Quote (Raevmo @ Mar. 11 2011,07:16)
The problem is a bit weird. Suppose it turns out that a t-test says it's very unlikely that the sample was taken from a population with mean mu=mu_0, i.e. has a very small p-value, even though we know for sure that the sample was taken from a population with mean mu_0. Then what? The only sensible conclusion then seems to be that the sampling procedure was "non-random" in some sense. Does that make sense, Oh Bob?

Yes, that makes sense.

Assuming there's a decent amount of data, I wouldn't worry too much about the distribution: if you've got the means and standard errors, you'll be fine. If I had the original data, I would have used a GLM. But if the original data were to hand, we wouldn't have this problem!

I'll let Erasmus explain the sins of log-transformation, and how it relates to cricket.


   
Richardthughes



Posts: 11178
Joined: Jan. 2006

(Permalink) Posted: Mar. 11 2011,10:22   

Thanks all, you has been moast halpfull.



Did you hear about when Bob'O was constipated? He worked it out with a pencil.  :p


  
Richardthughes



Posts: 11178
Joined: Jan. 2006

(Permalink) Posted: Mar. 11 2011,11:25   

Quote (Raevmo @ Mar. 11 2011,05:37)
Quote (Richardthughes @ Mar. 10 2011,11:59)
for a friend, actually:

"I’ve got one population that is a sub-set of another. I know the counts in both, the means and the standard deviations. What stat test can I use to tell if the mean of the sub-set is statistically different from the known mean of the whole? I think a t-test assumes two independent sets of data which these are not."

Halp!

If the mean (say mu) of the total population is known (say mu=mu_0), and the "counts" in the total population are normally distributed (the word "counts" actually suggests that the data are count data, i.e. non-negative integers, rather than continuous normal data), then a one-sample t-test could be used to test whether the sample is from a population with mu=mu_0.

Alternatively, take bootstrap samples from the sample and see how far out  mu_0 is in the bootstrap distribution of the mean.

Just to clarify above - counts are the population sizes. Oh pivot tables, you harsh mistress!


  
Dr.GH



Posts: 2333
Joined: May 2002

(Permalink) Posted: Mar. 11 2011,11:50   

Quote (Richardthughes @ Mar. 10 2011,16:39)
Quote (Dr.GH @ Mar. 10 2011,18:37)
If I understand your question, T-test. Delete the subsample from the initial sample.

Thank you sir. Would Anova make any sense?

Not for such a small sample (10?).


   
Richardthughes



Posts: 11178
Joined: Jan. 2006

(Permalink) Posted: Mar. 11 2011,12:02   

Quote (Dr.GH @ Mar. 11 2011,11:50)
Quote (Richardthughes @ Mar. 10 2011,16:39)
Quote (Dr.GH @ Mar. 10 2011,18:37)
If I understand your question, T-test. Delete the subsample from the initial sample.

Thank you sir. Would Anova make any sense?

Not for such a small sample (10?).

Gotcha. I'm not sure how big his sample is (ohh-err) I was just trying to come up with an example.


  
Dr.GH



Posts: 2333
Joined: May 2002

(Permalink) Posted: Mar. 11 2011,12:08   

I am still not clear on what the real question is. For example, if the question is "Have I a large enough sample to accurately estimate the mean and standard deviation of the population?" you might split the data set, then calculate t and F for each half and compare them to the total sample's parameters. If the results indicate that the sub-samples are basically the same, then problem solved.

ETA: There is a statistic to test if one has sampled the SD of a population that is an application of Chebyshev's Theorem, but I am having a caffeine deficit disorder.

Edited by Dr.GH on Mar. 11 2011,10:13


   
fnxtr



Posts: 3504
Joined: June 2006

(Permalink) Posted: Mar. 11 2011,18:48   

Shit. Never mind. Wrong thread.

--------------
"[A] book said there were 5 trillion witnesses. Who am I supposed to believe, 5 trillion witnesses or you? That shit's, like, ironclad. " -- stevestory

"Wow, you must be retarded. I said that CO2 does not trap heat. If it did then it would not cool down at night."  Joe G

  
  26 replies since Mar. 10 2011,11:59

    

