Q1. Write an R script to do the following:
a) simulate a sample of 100 random data points from a normal distribution with mean 100 and standard deviation 5 and store the result in a vector.
b) visualize the vector created above using different plots.
c) test the hypothesis that the mean equals 100.
d) use the Wilcoxon signed-rank test (wilcox.test) to test the hypothesis that the mean equals 90.
Q2. Use the algae data set from the package DMwR to complete the following tasks.
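One possible outline for Q1, offered as a sketch rather than a model answer. The seed value and the particular plots are assumptions chosen for illustration; any other sensible plots would do.

```r
# Sketch for Q1. The seed is an assumption, used only so runs are repeatable.
set.seed(42)

# (a) 100 draws from N(mean = 100, sd = 5), stored in a vector
x <- rnorm(100, mean = 100, sd = 5)

# (b) three common views of the same vector
hist(x, main = "Histogram of x", xlab = "x")
boxplot(x, main = "Boxplot of x")
qqnorm(x)
qqline(x)

# (c) one-sample t-test of H0: mean = 100
t_res <- t.test(x, mu = 100)

# (d) Wilcoxon signed-rank test of H0: location = 90
# (for a symmetric distribution the location coincides with the mean)
w_res <- wilcox.test(x, mu = 90)

print(t_res$p.value)
print(w_res$p.value)
```

Because the data are centred near 100, the test in (d) should reject a location of 90 decisively, while the outcome of (c) depends on the particular sample drawn.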
a) create a graph that you find adequate to show the distribution of the values of algae a6.
b) show the distribution of the values of size 3.
c) check visually if oPO4 follows a normal distribution.
d) produce a graph that allows you to understand how the values of NO3 are distributed across the sizes of river.
e) using a graph check if the distribution of algae a1 varies with the speed of the river.
f) visualize the relationship between the frequencies of algae a1 and a6. Give the appropriate graph title, x-axis and y-axis title.
Q3. Read the file Coweeta.CSV and write an R script to do the following:
a) count the number of observations per species.
b) take a subset of the data including only those species with at least 10 observations.
c) make a scatter plot of biomass versus height, with the symbol colour varying by species, and use filled squares for the symbols. Also add a title to the plot, in italics.
d) log-transform biomass, and redraw the plot.
Q4. The data set mammals (available in the MASS package) contains data on body weight versus brain weight. Write R commands to:
a) Find the Pearson and Spearman correlation coefficients. Are they similar?
b) Plot the data using the plot command.
c) Plot the logarithm (log) of each variable and see if that makes a difference.
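A minimal sketch for Q4, assuming the mammals data set is loaded from the MASS package, where its columns are named body and brain. The axis labels are illustrative.

```r
# Sketch for Q4, assuming mammals comes from the MASS package.
library(MASS)

# (a) Pearson measures linear association; Spearman works on ranks
pearson  <- cor(mammals$body, mammals$brain, method = "pearson")
spearman <- cor(mammals$body, mammals$brain, method = "spearman")
print(c(pearson = pearson, spearman = spearman))

# (b) raw scale: a few very large animals dominate the picture
plot(mammals$body, mammals$brain,
     xlab = "Body weight", ylab = "Brain weight",
     main = "Brain vs body weight")

# (c) log scale: the relationship is much easier to read
plot(log(mammals$body), log(mammals$brain),
     xlab = "log(body weight)", ylab = "log(brain weight)",
     main = "Brain vs body weight (log-log)")
```

Comparing the two plots is the point of part (c): the log transform spreads out the small animals that are crowded near the origin on the raw scale.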
Q5. The MASS library includes a data set, UScereal, which contains information about popular breakfast cereals. Attach the data set and use different kinds of plots to investigate the following relationships:
a) relationship between manufacturer and shelf
b) relationship between fat and vitamins
c) relationship between fat and shelf
d) relationship between carbohydrates and sugars
e) relationship between fibre and manufacturer
f) relationship between sodium and sugars
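A partial sketch for Q5 covering three of the six relationships. It assumes the UScereal column names mfr, shelf, fat, carbo and sugars, and it passes the data frame explicitly instead of calling attach(), which is a stylistic choice and not what the question asks for. The choice of plot type per pair is also only a suggestion.

```r
# Partial sketch for Q5 (parts a, c and d only).
library(MASS)

# (a) two categorical variables: cross-tabulate, then draw a mosaic plot
tab <- table(UScereal$mfr, UScereal$shelf)
print(tab)
mosaicplot(tab, main = "Manufacturer by shelf",
           xlab = "Manufacturer", ylab = "Shelf")

# (c) numeric vs categorical: one boxplot of fat per shelf
boxplot(fat ~ shelf, data = UScereal,
        xlab = "Shelf", ylab = "Fat", main = "Fat by shelf")

# (d) two numeric variables: scatter plot
plot(UScereal$carbo, UScereal$sugars,
     xlab = "Carbohydrates", ylab = "Sugars",
     main = "Sugars vs carbohydrates")
```

The remaining parts follow the same pattern: pick the plot from the types of the two variables involved.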
Q6. Write an R script to:
a) Do two simulations of a binomial number with n = 100 and p = 0.5. Do you get the same results each time? What is different? What is similar?
b) Simulate from the normal distribution twice: once with n = 10, µ = 10 and σ = 10, and once with n = 10, µ = 100 and σ = 100. How are they different? How are they similar? Are both approximately normal?
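A short sketch for Q6. It reads "a binomial number with n = 100" as a single draw with 100 trials, which is an interpretation of the question. No seed is set, so the values change from run to run.

```r
# Sketch for Q6. No seed is set on purpose, so repeated runs differ.

# (a) two independent draws from Binomial(size = 100, prob = 0.5)
b1 <- rbinom(1, size = 100, prob = 0.5)
b2 <- rbinom(1, size = 100, prob = 0.5)
print(c(b1, b2))

# (b) two normal samples of 10 values each
n1 <- rnorm(10, mean = 10, sd = 10)
n2 <- rnorm(10, mean = 100, sd = 100)

# Standardising puts both samples on the same scale for comparison
z1 <- (n1 - 10) / 10
z2 <- (n2 - 100) / 100
print(summary(z1))
print(summary(z2))

# Normal Q-Q plots give a visual check of approximate normality
qqnorm(n1, main = "Q-Q plot: N(10, 10)")
qqline(n1)
qqnorm(n2, main = "Q-Q plot: N(100, 100)")
qqline(n2)
```

With only 10 points per sample, the Q-Q plots will look noisy even though both samples come from a normal distribution.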
Q7. Create a database medicines that contains the details about medicines such as {manufacturer, composition, price}. Create an interactive application with which the user can find an alternative to a given medicine with the same composition.
Q8. Create a database songs that contains the fields {song_name, mood, online_link_play_song}. Create an application where the mood of the user is given as input and the list of songs corresponding to that mood appears as the output. The user can listen to any song from the list via the online link given.
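The lookup at the heart of Q7 can be sketched as follows. Everything here is hypothetical: an in-memory data frame stands in for the database, the rows are placeholder values, the name column is an added assumption (the question lists only manufacturer, composition and price), and find_alternatives is an illustrative helper. A real solution would add a database backend and an interactive front end.

```r
# Sketch for Q7. A data frame stands in for the database;
# all rows below are hypothetical placeholders.
medicines <- data.frame(
  name         = c("MedA", "MedB", "MedC"),
  manufacturer = c("Maker1", "Maker2", "Maker3"),
  composition  = c("compound_X", "compound_X", "compound_Y"),
  price        = c(10, 8, 12),
  stringsAsFactors = FALSE
)

# Return every other medicine that shares the given medicine's composition.
# An unknown medicine name yields an empty data frame.
find_alternatives <- function(db, medicine) {
  target <- db$composition[db$name == medicine]
  if (length(target) == 0) {
    return(db[0, ])
  }
  db[db$composition == target[1] & db$name != medicine, ]
}

alts <- find_alternatives(medicines, "MedA")
print(alts)
```

In an interactive session, the medicine name could be collected with readline() and passed to the same function.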