Tribhuvan University

Institute of Science and Technology

2078

Bachelor Level / third-semester / Science

Computer Science and Information Technology (STA215)

Statistics II

Full Marks: 60 + 20 + 20

Pass Marks: 24 + 8 + 8

Time: 3 Hours

Candidates are required to give their answers in their own words as far as practicable.

The figures in the margin indicate full marks.

Group A

Attempt any TWO questions

1

There are three brands of computers, namely Dell, Lenovo, and HP. The following are the lifetimes (in years) of 15 computers.

Serial Number   Computer Brand   Lifetime in years
1               Dell             15
2               Lenovo           10
3               HP               9
4               Dell             12
5               Lenovo           6
6               HP               7
7               Dell             4
8               Lenovo           8
9               HP               13
10              Dell             11
11              HP               5
12              Lenovo           7
13              Dell             3
14              HP               5
15              Lenovo           4

Apply an appropriate statistical test to identify whether the average lifetime (in years) is significantly different across the three brands of computers at a 5% level of significance. You may first re-tabulate the data in the format required for the statistical analysis.
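As a check on a hand calculation, the sketch below assumes that one-way ANOVA is the intended test (the usual choice for comparing three independent group means) and computes the F statistic from the table above using only the Python standard library. The variable names are illustrative.

```python
from statistics import fmean

# Lifetimes re-tabulated by brand (values taken from the question's table).
lifetimes = {
    "Dell": [15, 12, 4, 11, 3],
    "Lenovo": [10, 6, 8, 7, 4],
    "HP": [9, 7, 13, 5, 5],
}

all_values = [x for values in lifetimes.values() for x in values]
grand_mean = fmean(all_values)
k = len(lifetimes)          # number of brands
n_total = len(all_values)   # total number of computers

# Between-brand and within-brand sums of squares.
ss_between = sum(len(v) * (fmean(v) - grand_mean) ** 2 for v in lifetimes.values())
ss_within = sum((x - fmean(v)) ** 2 for v in lifetimes.values() for x in v)

ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
f_stat = ms_between / ms_within

print(f"SSB = {ss_between:.4f}, SSW = {ss_within:.4f}, F = {f_stat:.4f}")
```

Under these assumptions F is about 0.35 with (2, 12) degrees of freedom, well below the tabulated 5% critical value of roughly 3.89, so the data would not show a significant difference in mean lifetime across brands.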

2

Explain the sampling distribution of the mean with reference to a numerical example. Illustrate the practical implications of the Central Limit Theorem (CLT) in inferential statistics.
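A numerical example of the kind asked for can be sketched as follows. The population {2, 4, 6, 8} and the sample size of 2 are hypothetical values chosen only for illustration. The code enumerates every possible sample drawn with replacement and compares the mean and variance of the sample means with the population values.

```python
from itertools import product
from statistics import fmean, pvariance

population = [2, 4, 6, 8]   # hypothetical population, for illustration only
n = 2                       # hypothetical sample size

# Every ordered sample of size n drawn with replacement (4**2 = 16 samples).
sample_means = [fmean(sample) for sample in product(population, repeat=n)]

pop_mean = fmean(population)
pop_var = pvariance(population)
mean_of_means = fmean(sample_means)
var_of_means = pvariance(sample_means)

print(f"population mean = {pop_mean}, mean of sample means = {mean_of_means}")
print(f"population variance / n = {pop_var / n}, variance of sample means = {var_of_means}")
```

The mean of the sample means equals the population mean, and their variance equals the population variance divided by n. This is the property the CLT builds on when it describes the approximate normality of the sample mean for large n.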

3

A study was conducted among IT officers working in different IT centers in the Kathmandu Valley. One of the objectives of the study was to quantify the effect of age and working hours per day on Computer Vision Syndrome (CVS). CVS was measured on a continuous scale ranging from 0 to 50. A part of the survey data was selected at random and is provided in the following table for statistical analysis.

Respondent’s ID                  001  007  125  231   99  299  145
Scale of CVS                       6    7    5   11    3   29   28
Age of respondent (in years)      24   26   30   41   47   50   52
Working hours (per day)            4    5    6    8    3    6    7

Identify the dependent variable. Assuming that the relationship between CVS, age, and working hours is linear, fit a multiple linear regression model to address the objective of the study and interpret the model appropriately.
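One way to verify a hand-fitted model is sketched below. It assumes CVS is the response and that age and working hours are the predictors, builds the normal equations (X'X)b = X'y, and solves them with a small Gauss-Jordan routine. The helper name `solve` is hypothetical and introduced only for this example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

# Data from the question's table.
cvs = [6, 7, 5, 11, 3, 29, 28]
age = [24, 26, 30, 41, 47, 50, 52]
hours = [4, 5, 6, 8, 3, 6, 7]

# Design matrix with an intercept column.
X = [[1.0, a, h] for a, h in zip(age, hours)]
xtx = [[sum(row[i] * row[j] for row in X) for j in range(3)] for i in range(3)]
xty = [sum(row[i] * y for row, y in zip(X, cvs)) for i in range(3)]

b0, b1, b2 = solve(xtx, xty)
fitted = [b0 + b1 * a + b2 * h for a, h in zip(age, hours)]
residuals = [y - f for y, f in zip(cvs, fitted)]

print(f"CVS = {b0:.4f} + {b1:.4f} * age + {b2:.4f} * hours")
```

In the fitted equation, b1 is the estimated change in the CVS score per additional year of age with working hours held fixed, and b2 is the estimated change per additional working hour with age held fixed.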

Group B

Attempt any EIGHT questions

4

The following are the details of classroom working hours per week of male and female faculty working in the area of Computer Science and Information Technology at Tribhuvan University.

                                               Male Faculty   Female Faculty
Sample size                                    60             30
Average working hours per week                 12             9
Standard deviation of working hours per week   4              3

Apply an independent-samples t-test to examine whether the average classroom working hours per week differ significantly between male and female faculty at a 1% level of significance. Also state the null and alternative hypotheses appropriately.
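The computation can be sketched from the summary statistics alone. The sketch assumes equal population variances (a pooled-variance t-test) and that the reported standard deviations use the n - 1 divisor. A textbook that uses a different convention would give a slightly different value.

```python
from math import sqrt

# Summary statistics from the question's table.
n1, mean1, sd1 = 60, 12.0, 4.0   # male faculty
n2, mean2, sd2 = 30, 9.0, 3.0    # female faculty

# H0: mu1 = mu2  versus  H1: mu1 != mu2 (two-tailed).
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean1 - mean2) / std_error

print(f"pooled variance = {pooled_var:.4f}, t = {t_stat:.4f}, df = {df}")
```

Under these assumptions t is about 3.63 with 88 degrees of freedom, which exceeds the two-tailed 1% critical value of roughly 2.63, so the null hypothesis of equal means would be rejected.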

5

A survey was conducted among 70 randomly selected students studying B.Sc. CSIT in some colleges. Among them, 50 students secured more than 80% marks in statistics. Compute 99% and 95% confidence intervals for the population proportion of students who secured more than 80% marks in statistics, and comment on the results.
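A minimal sketch of the interval calculation follows, assuming the usual large-sample (Wald) interval p ± z·sqrt(p(1 - p)/n). The function name `wald_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

n, successes = 70, 50
p_hat = successes / n
std_error = sqrt(p_hat * (1 - p_hat) / n)

def wald_interval(confidence):
    """Large-sample confidence interval for a proportion at the given confidence level."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return p_hat - z * std_error, p_hat + z * std_error

ci95 = wald_interval(0.95)
ci99 = wald_interval(0.99)

print(f"95% CI: ({ci95[0]:.4f}, {ci95[1]:.4f})")
print(f"99% CI: ({ci99[0]:.4f}, {ci99[1]:.4f})")
```

The 95% interval is approximately (0.608, 0.820) and the 99% interval approximately (0.575, 0.853). The higher confidence level gives the wider interval.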

6

In location 1, there are 250 corona-positive cases out of 460 persons, and in location 2, 250 positive cases were reported out of 650 persons. Can it be concluded that the proportion of corona-positive cases is higher in location 1 compared to location 2? Test at a 10% level of significance.
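The test statistic can be sketched as below, assuming a one-tailed two-proportion z-test with a pooled proportion under the null hypothesis (H0: P1 = P2 versus H1: P1 > P2).

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 250, 460   # location 1: positive cases, persons
x2, n2 = 250, 650   # location 2: positive cases, persons

p1, p2 = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)   # common proportion under H0

std_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z_stat = (p1 - p2) / std_error

z_critical = NormalDist().inv_cdf(0.90)    # right-tail critical value at the 10% level
p_value = 1 - NormalDist().cdf(z_stat)     # right-tailed p-value

print(f"z = {z_stat:.4f}, critical value = {z_critical:.4f}, p-value = {p_value:.2e}")
```

Here z is about 5.24, far above the critical value of about 1.28, so under these assumptions the data support a higher proportion of positive cases in location 1.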

7

Previous literature has reported that the average age of students enrolling in B.Sc. CSIT at Tribhuvan University is 22 years. A researcher doubts this information and believes that the average age is less than 22 years. In order to examine this, the following sample data were collected randomly from students enrolling in CSIT.

Age in years: 20 19 22 23 19 20 21 20 19 20

Set up the null and alternative hypotheses and test whether the researcher’s doubt is justified. Use a 5% level of significance. Assume that the parent population from which the sample is drawn is normally distributed.
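A sketch of the calculation follows, assuming a left-tailed one-sample t-test (H0: mu = 22 versus H1: mu < 22) with the sample standard deviation computed using the n - 1 divisor.

```python
from math import sqrt
from statistics import fmean, stdev

ages = [20, 19, 22, 23, 19, 20, 21, 20, 19, 20]
mu0 = 22                      # hypothesized population mean

n = len(ages)
sample_mean = fmean(ages)
sample_sd = stdev(ages)       # uses the n - 1 divisor
std_error = sample_sd / sqrt(n)
t_stat = (sample_mean - mu0) / std_error
df = n - 1

print(f"mean = {sample_mean:.2f}, sd = {sample_sd:.4f}, t = {t_stat:.4f}, df = {df}")
```

The statistic is about -4.02 with 9 degrees of freedom, below the left-tail 5% critical value of about -1.833, so under these assumptions the researcher's doubt is supported by the sample.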

8

Apply the Mann-Whitney U test to examine the following knowledge scores on IT for two groups of IT workers at a 5% level of significance.

Group A: 5 8 2 7 6
Group B: 9 12 4 6
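The rank-sum calculation can be sketched as follows, assuming tied scores receive the average of the ranks they occupy. The helper name `average_ranks` is hypothetical.

```python
group_a = [5, 8, 2, 7, 6]
group_b = [9, 12, 4, 6]

def average_ranks(values):
    """Map each distinct value to its rank, giving ties the average of their positions."""
    ordered = sorted(values)
    ranks = {}
    for value in set(ordered):
        positions = [i + 1 for i, x in enumerate(ordered) if x == value]
        ranks[value] = sum(positions) / len(positions)
    return ranks

ranks = average_ranks(group_a + group_b)
r_a = sum(ranks[x] for x in group_a)   # rank sum of group A
r_b = sum(ranks[x] for x in group_b)   # rank sum of group B

n_a, n_b = len(group_a), len(group_b)
u_a = n_a * n_b + n_a * (n_a + 1) / 2 - r_a
u_b = n_a * n_b + n_b * (n_b + 1) / 2 - r_b
u_stat = min(u_a, u_b)

print(f"R_A = {r_a}, R_B = {r_b}, U_A = {u_a}, U_B = {u_b}, U = {u_stat}")
```

The smaller value, U = 6.5, is then compared with the tabulated critical value for sample sizes 5 and 4 at the 5% level. The null hypothesis of identical distributions is rejected only if U does not exceed that value.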

9

A survey was conducted to examine the association between the hacking status of an email account and the type of email account. The survey reported the following cross-tabulation.

Type of e-mail account   Hacking status
                         Yes    No
Yahoo                    60     15
Gmail                    20     120

Does the information provide sufficient evidence to conclude that the type of email account and the hacking status are associated? Use the chi-square test at a 1% level of significance.
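The statistic can be sketched from observed and expected frequencies, assuming the Pearson chi-square test of independence without Yates' continuity correction.

```python
# Observed counts from the cross-tabulation: rows are account types, columns are Yes / No.
observed = {
    "Yahoo": [60, 15],
    "Gmail": [20, 120],
}

row_totals = {name: sum(counts) for name, counts in observed.items()}
col_totals = [sum(counts[j] for counts in observed.values()) for j in range(2)]
grand_total = sum(row_totals.values())

# Pearson chi-square: sum of (O - E)^2 / E over all cells.
chi_square = 0.0
for name, counts in observed.items():
    for j, obs in enumerate(counts):
        expected = row_totals[name] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(col_totals) - 1)
print(f"chi-square = {chi_square:.4f}, df = {df}")
```

The statistic is about 90.27 with 1 degree of freedom, far above the tabulated 1% critical value of 6.635, so under these assumptions the type of account and the hacking status appear to be associated.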

10

State the mathematical model for the statistical analysis of an m × m Latin Square Design (LSD) with one observation per experimental unit. Also prepare a dummy ANOVA table for it.

11

Define a Markov chain and introduce its basic notation. Also, explain the characteristics of a Markov chain.

12

Write short notes on the following:

1. The rationale for using non-parametric statistical tests
2. Estimation of the minimum sample size for a given proportion