Statistical analysis of cap torque data and family energy cost in R, Statistics notes

Universidad de las Américas Puebla (UDLAP), Statistics

Statistical analysis of cap torque data and family energy cost using statistical analysis tools in R. The document includes the R code for generating plots and hypothesis tests, together with the results obtained. The cap torque analysis covers two different machines and evaluates distribution, dispersion, and central values. The family energy cost analysis covers 25 families and evaluates whether the cost has changed from the previous year, when the mean monthly cost was $200.

Type: Notes

2023/2024

Uploaded on 18/03/2024

ubaldo-moran-paniagua 🇲🇽


Activity 2. Basic Statistics in R

Selected Topics 1

Spring 2022

The due date is February 25th at 11:59 pm.

Instructions: This activity can be done in pairs or in groups of three. Only one member submits the activity, with the names of all members. You must send a screenshot of the Teams meeting with the video camera turned on; this requirement is evidence of teamwork.

Names: Rebeca Paola Aguilar Jaimes - 159337
Ubaldo de Jesús Morán Paniagua - 159580

For each problem, analyze the information using different statistical analysis tools in R to answer the questions. For each tool used, you must provide specific conclusions. If you do not write conclusions for each analysis, the solution does not count toward your score.

Problem 1. Cap removal torque data

A quality control engineer needs to ensure that the caps on shampoo bottles are fastened correctly. If the caps are fastened too loosely, they may fall off during shipping. If they are fastened too tightly, they may be too difficult to remove. The target torque value for fastening the caps is 18. The engineer collects a random sample of 68 bottles and tests the amount of torque that is needed to remove the caps.


Column    Description
Torque    The torque that is needed to remove the cap
Machine   The machine that tightened the cap: 1 or 2

a) Use a boxplot grouping the torque in two categories, one for machine 1 and the other for machine 2. Use the results to make your conclusions about dispersion, distribution, central values, etc.

INPUT

#First
CapTorque <- read.csv("~/R/DataSets/CapTorque.csv")
View(CapTorque)
machine_type1 = CapTorque$Torque[CapTorque$Machine == 1]
machine_type2 = CapTorque$Torque[CapTorque$Machine == 2]
boxplot(machine_type1,machine_type2,col=c("green","yellow"))
summary(machine_type1)
summary(machine_type2)
library(pastecs)
stat.desc(machine_type1)
stat.desc(machine_type2)

#Hypothesis test, machine 1 (H0: mean torque = 18, two-tailed)
xbarra1=mean(machine_type1)
mu1=18 #target torque from the problem statement
sd1=sd(machine_type1)
n1=length(machine_type1)
z1=(xbarra1-mu1)/(sd1/sqrt(n1))
z1
alpha1=(0.05)/2 #0.05 significance level split over two tails
alpha1
z.alpha1=qnorm(1-alpha1)
z.alpha1
pval1=(1-pnorm(z1))*2
pval1
#Conclusion: machine 1 is achieving the target torque value for the caps

#Hypothesis test, machine 2 (H0: mean torque = 18, two-tailed)
xbarra2=mean(machine_type2)
mu2=18 #target torque from the problem statement
sd2=sd(machine_type2)
n2=length(machine_type2)
z2=(xbarra2-mu2)/(sd2/sqrt(n2))
z2
alpha2=(0.05)/2
alpha2
z.alpha2=qnorm(1-alpha2)
z.alpha2
pval2=(1-pnorm(z2))*2
pval2

[1] 1.
> pval1=(1-pnorm(z1))*2
> pval1
[1] 0.
> #Conclusion: machine 1 is achieving the target torque value for the caps
> #Hypothesis test, machine 2
> xbarra2=mean(machine_type2)
> mu2=18
> sd2=sd(machine_type2)
> n2=length(machine_type2)
> z2=(xbarra2-mu2)/(sd2/sqrt(n2))
> z2
[1] 4.
> alpha2=(0.05)/2
> alpha2
[1] 0.
> z.alpha2=qnorm(1-alpha2)
> z.alpha2
[1] 1.
> pval2=(1-pnorm(z2))*2
> pval2
[1] 8.788042e-07
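The manual z-test steps in the transcript above can be collected into one helper. This is a minimal sketch, not the authors' code: `z_test_two_sided` is a hypothetical name, and the sample vector is simulated because CapTorque.csv is not reproduced here. It assumes the sample is large enough for the normal approximation with the sample standard deviation.

```r
# Two-sided one-sample z-test using the sample standard deviation.
# Hypothetical helper for illustration; it is not part of base R.
z_test_two_sided <- function(x, mu, alpha = 0.05) {
  n <- length(x)
  z <- (mean(x) - mu) / (sd(x) / sqrt(n))
  z_crit <- qnorm(1 - alpha / 2)       # critical value for a two-sided test
  p_value <- 2 * (1 - pnorm(abs(z)))   # abs() also covers means below the target
  list(z = z, z_crit = z_crit, p_value = p_value, reject = abs(z) > z_crit)
}

# Simulated torque readings, for illustration only (not the real data).
set.seed(1)
torque_demo <- rnorm(34, mean = 18.5, sd = 4)
res <- z_test_two_sided(torque_demo, mu = 18)
print(res)
```

Taking `abs(z)` before computing the p-value keeps the two-sided result valid when the sample mean falls below the target, a case where `(1 - pnorm(z)) * 2` would return a value above 1.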

We can see that the torque values for Machine 1 are less spread out than the values for Machine 2. The data for Machine 1 follow a roughly normal distribution skewed to the left, while the data for Machine 2 follow an almost uniform distribution.

For the test statistic we assume a normal distribution and constant variance for both machines. Machine 1 is consistent with the target fastening torque of 18: a two-tailed test at a significance level of 0.05 gives a p-value of about 0.36, which is above 0.05, so the hypothesis that the mean equals 18 is not rejected. Machine 2 gives a very small p-value (8.788042e-07); since this is below the significance level (alpha), Machine 2 does not meet the target value.

b) Finally, determine the mean and the standard deviation of each machine's torque and make your final conclusions about the problem. You can use the predefined functions in R to make this analysis.

Machine 1 has the least variability, an almost normal distribution (skewed to the right), and a mean closer to the target of 18: its mean removal torque is 18.67 with a standard deviation of 4.394, compared with a mean of 24.19 and a standard deviation of 7.11 for Machine 2. It can therefore be concluded that the two means are different, as seen both in the graph and in the results.

Problem 2. Family energy cost data

An economist wants to determine whether the monthly energy cost for families has changed from the previous year, when the mean cost per month was $200. The economist randomly samples 25 families and records their energy costs for the current year.

Data file: FamilyEnergyCost.csv

Worksheet column   Description
Family ID          The family identification number
Energy Cost        The mean cost of energy per month

a) Use a histogram to evaluate if the energy cost follows a normal distribution function.

INPUT

#Second
FamilyEnergyCost <- read.csv("~/R/DataSets/FamilyEnergyCost.csv")
View(FamilyEnergyCost)
hist(FamilyEnergyCost$Energy.Cost)
#install.packages('nortest')
library(nortest)
ad.test(FamilyEnergyCost$Energy.Cost) #Anderson-Darling normality test
str(FamilyEnergyCost)
Energy_Cost=FamilyEnergyCost$Energy.Cost
hist(Energy_Cost) #Histogram
Mean_Family_Energy=mean(Energy_Cost) #Mean of the energy costs
sd_energy_cost=sd(Energy_Cost) #Standard deviation of the energy costs
summary(Energy_Cost) #Summary

OUTPUT

> #Second

> FamilyEnergyCost <- read.csv("~/R/DataSets/FamilyEnergyCost.csv")
> View(FamilyEnergyCost)
> hist(FamilyEnergyCost$Energy.Cost)
> #install.packages('nortest')
> library(nortest)
> ad.test(FamilyEnergyCost$Energy.Cost) #Anderson-Darling normality test

        Anderson-Darling normality test

data:  FamilyEnergyCost$Energy.Cost
A = 0.22709, p-value = 0.

> str(FamilyEnergyCost)
'data.frame':   25 obs. of  2 variables:
 $ Family.ID  : int  1 2 3 4 5 6 7 8 9 10 ...
 $ Energy.Cost: int  211 572 558 250 478 307 184 435 460 308 ...
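The question behind Problem 2, whether the mean monthly cost has changed from $200, maps onto a two-sided one-sample t-test. The sketch below is an illustration under stated assumptions rather than the authors' analysis: it uses only the first ten energy costs visible in the `str()` output above, since the other 15 values are not shown, and `energy_cost_shown` is a hypothetical name. It also uses the base-R Shapiro-Wilk test in place of `ad.test`, so it runs without the nortest package.

```r
# First ten energy costs as printed by str(); the remaining 15 are not shown,
# so results here do not represent the full sample of 25 families.
energy_cost_shown <- c(211, 572, 558, 250, 478, 307, 184, 435, 460, 308)

# Shapiro-Wilk normality check (base-R alternative to nortest::ad.test).
normality <- shapiro.test(energy_cost_shown)

# Two-sided one-sample t-test of H0: mean monthly cost = 200.
cost_test <- t.test(energy_cost_shown, mu = 200)

print(normality$p.value)
print(cost_test)
```

With the full data set, the same call would be `t.test(FamilyEnergyCost$Energy.Cost, mu = 200)`.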

After        The resting heart rate of the person after the running program
Difference   The difference between the person's resting heart rate before and after the running program

a) Using a 95% confidence interval mean plot, evaluate whether there is a difference between the resting heart rate of people before and after the running program. (15 points)

INPUT

#Third
library(gplots) #provides plotCI
RestingHeartRate <- read.csv("~/R/DataSets/RestingHeartRate.csv")
View(RestingHeartRate)
RestingHeartRate=subset(RestingHeartRate,select=c(Before,After))
RestingHeartRate
str(RestingHeartRate)
means_RestingHeartRate = sapply(RestingHeartRate,mean)
stdev_RestingHeartRate = sapply(RestingHeartRate,sd)
n = sapply(RestingHeartRate,length)
qt(0.975,n-1)
interval = qt(0.975,n-1)*stdev_RestingHeartRate/sqrt(n) #half-width of each 95% interval
plotCI(x=means_RestingHeartRate, uiw =interval, barcol="blue",
       main=expression(paste("Confidence Interval Plot of difference between the heart rate of people before and after ",alpha,"=5%")))
t.test(RestingHeartRate$Before,RestingHeartRate$After,var.equal = FALSE)
t.test(RestingHeartRate$Before,RestingHeartRate$After,var.equal = TRUE)

OUTPUT

> #Second

> RestingHeartRate <- read.csv("~/R/DataSets/RestingHeartRate.csv")
> View(RestingHeartRate)
> RestingHeartRate=subset(RestingHeartRate,select=c(Before,After))
> RestingHeartRate
   Before After
1      68    67
2      76    77
3      74    74
4      71    74
5      71    69
6      72    70
7      75    71
8      83    77
9      75    71
10     74    74
11     76    73
12     77    68
13     78    71
14     75    72
15     75    77
16     84    80
17     77    74
18     69    73
19     75    72
20     65    62
> str(RestingHeartRate)
'data.frame':   20 obs. of  2 variables:
 $ Before: int  68 76 74 71 71 72 75 83 75 74 ...
 $ After : int  67 77 74 74 69 70 71 77 71 74 ...
> means_RestingHeartRate = sapply(RestingHeartRate,mean)
> stdev_RestingHeartRate = sapply(RestingHeartRate,sd)
> n = sapply(RestingHeartRate,length)
> qt(0.975,n-1)
  Before    After
2.093024       2.
> interval = qt(0.975,n-1)*stdev_RestingHeartRate/sqrt(n)
> plotCI(x=means_RestingHeartRate, uiw =interval, barcol="blue",
+        main=expression(paste("Confidence Interval Plot of difference between the heart rate of people before and after ",alpha,"=5%")))
> t.test(RestingHeartRate$Before,RestingHeartRate$After,var.equal = FALSE)

        Welch Two Sample t-test

data:  RestingHeartRate$Before and RestingHeartRate$After
t = 1.6219, df = 37.57, p-value = 0.
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.5470543  4.
sample estimates:
mean of x mean of y
     74.5       72.

> t.test(RestingHeartRate$Before,RestingHeartRate$After,var.equal = TRUE)

        Two Sample t-test

data:  RestingHeartRate$Before and RestingHeartRate$After
t = 1.6219, df = 38, p-value = 0.
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.5460218  4.
sample estimates:
mean of x mean of y
     74.5       72.

Using the confidence intervals for the before and after means, we conclude that there is no significant difference in resting heart rate before and after the running program, because the two intervals overlap. The two-sample t-tests agree: both 95% confidence intervals for the difference in means include 0.
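Because each person is measured both before and after the program, the observations come in pairs, and a paired t-test is a natural comparison to the two-sample tests above. The sketch below is an illustration rather than the analysis the authors ran; it retypes the 20 rows printed in the output, and the vector names `before` and `after` are introduced here for convenience.

```r
# The 20 Before/After pairs as printed in the output above.
before <- c(68, 76, 74, 71, 71, 72, 75, 83, 75, 74,
            76, 77, 78, 75, 75, 84, 77, 69, 75, 65)
after  <- c(67, 77, 74, 74, 69, 70, 71, 77, 71, 74,
            73, 68, 71, 72, 77, 80, 74, 73, 72, 62)

# Paired test: equivalent to a one-sample t-test on the differences.
paired_test <- t.test(before, after, paired = TRUE)
diff_mean <- mean(before - after)

print(diff_mean)
print(paired_test)
```

Pairing removes the person-to-person variation from the comparison, so the paired test can reach a different conclusion than the two-sample tests on the same numbers. Which analysis the assignment expects is not stated in the visible text.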

> View(AcademicSalaries)
> AcademicSalaries=na.omit(AcademicSalaries)
> AcademicSalaries=as.data.frame(AcademicSalaries)
> str(AcademicSalaries)
'data.frame':   45 obs. of  3 variables:
 $ Subject: int  1 1 1 1 1 1 1 1 1 1 ...
 $ Degree : int  1 1 2 2 3 3 3 3 3 3 ...
 $ Salary : num  1.7 1.9 1.8 2.1 2.5 2.7 2.9 2.5 2.6 2. ...
 - attr(*, "na.action")= 'omit' Named int [1:97] 46 47 48 49 50 51 52 53 54 55 ...
  ..- attr(*, "names")= chr [1:97] "46" "47" "48" "49" ...
> AcademicSalaries$Subject <- as.factor(AcademicSalaries$Subject)
> AcademicSalaries$Degree <- as.factor(AcademicSalaries$Degree)
> str(AcademicSalaries)
'data.frame':   45 obs. of  3 variables:
 $ Subject: Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 1 1 1 1 ...
 $ Degree : Factor w/ 3 levels "1","2","3": 1 1 2 2 3 3 3 3 3 3 ...
 $ Salary : num  1.7 1.9 1.8 2.1 2.5 2.7 2.9 2.5 2.6 2. ...
 - attr(*, "na.action")= 'omit' Named int [1:97] 46 47 48 49 50 51 52 53 54 55 ...
  ..- attr(*, "names")= chr [1:97] "46" "47" "48" "49" ...
> summary(AcademicSalaries)
 Subject Degree     Salary
 1:12    1:10   Min.   :1.
 2:13    2:13   1st Qu.:2.
 3:11    3:22   Median :2.
 4: 9           Mean   :2.
                3rd Qu.:3.
                Max.   :3.
> Subject_Salary=split(AcademicSalaries$Salary,AcademicSalaries$Subject)
> str(Subject_Salary)
List of 4
 $ 1: num [1:12] 1.7 1.9 1.8 2.1 2.5 2.7 2.9 2.5 2.6 2. ...
 $ 2: num [1:13] 2.5 2.3 2.6 2.4 2.7 2.4 2.6 2.4 2.5 3. ...
 $ 3: num [1:11] 2.7 2.8 2.9 3 2.8 2.7 3.7 3.6 3.7 3. ...
 $ 4: num [1:9] 2.5 2.6 2.3 2.8 3.3 3.4 3.3 3.5 3.
> means_Subject_Salary = sapply(Subject_Salary,mean)
> stdev_Subject_Salary = sapply(Subject_Salary,sd)
> ggplot(AcademicSalaries,aes(x=Subject,y=Salary,fill=Subject))+geom_boxplot()
> summary(AcademicSalaries)
 Subject Degree     Salary
 1:12    1:10   Min.   :1.
 2:13    2:13   1st Qu.:2.
 3:11    3:22   Median :2.
 4: 9           Mean   :2.
                3rd Qu.:3.
                Max.   :3.
> my_anova2=aov(Salary~Subject+Degree+Subject*Degree, data=AcademicSalaries)
> summary(my_anova2)
               Df Sum Sq Mean Sq F value  Pr(>F)
Subject         3  4.168   1.389   63.85 7.9e-14 ***
Degree          2  8.382   4.191  192.63 < 2e-16 ***
Subject:Degree  6  0.044   0.007    0.34 0.
Residuals      33  0.718   0.
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> Tukey=TukeyHSD(my_anova2)
> Tukey
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = Salary ~ Subject + Degree + Subject * Degree, data = AcademicSalaries)

$Subject

$Degree

$`Subject:Degree`
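The Tukey tables themselves are missing from the transcript above, so the following sketch shows what the same workflow returns on simulated data. Everything in it is an assumption for illustration: the data frame `demo`, its effect sizes, and the three replicates per cell are made up and are not the AcademicSalaries data. The formula `Salary ~ Subject * Degree` expands to both main effects plus their interaction, which is the same model as the formula in the transcript.

```r
# Simulated salaries for 4 subjects x 3 degrees, 3 replicates per cell.
# Made-up effect sizes; this is NOT the AcademicSalaries data.
set.seed(42)
demo <- expand.grid(Subject = factor(1:4), Degree = factor(1:3), rep = 1:3)
demo$Salary <- 2 + 0.2 * as.integer(demo$Subject) +
  0.4 * as.integer(demo$Degree) + rnorm(nrow(demo), sd = 0.15)

# Two-way ANOVA with interaction, then Tukey's HSD on every term.
fit <- aov(Salary ~ Subject * Degree, data = demo)
print(summary(fit))

tukey <- TukeyHSD(fit)
# One matrix per model term; each row is a pairwise difference with
# its confidence bounds and adjusted p-value.
print(tukey$Degree)
```

Each element of the result is a matrix with columns `diff`, `lwr`, `upr`, and `p adj`; a pair of levels differs at the family-wise 5% level when its interval excludes 0.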