Without knowing the context of the problem in the book from which it came, it is difficult to know what three tests the author of the question may have in mind.
I will give details about a couple of possibilities that seem reasonable, and I hope that will be
helpful.
Background and data. It is important to note that we have a sample of size $n = 45,$ mean $\bar X = 4.0,$ and standard deviation $S = 1.$ And we have a population (unknown size, presumably very large) with mean $\mu = 3.5$ and standard deviation
$\sigma = \sqrt{44} = 6.633.$ We are asked to 'compare the two mean values'.
I think you are right that we are supposed to compare the mean $\mu_1$ of the
actual population from which the 45 lambs were chosen (perhaps one cooperative
of farms) with the population mean $\mu_2$ for the entire 'race'. That would be
a one sample test of $H_0: \mu_1 = \mu_2$ against the alternative
$H_a: \mu_1 \ne \mu_2.$
But an unusual question arises: Do we use the SD $S = 1$ from the relatively small sample, or the 'race' SD $\sigma = 6.633$? Without knowing a lot more
than I do about lambs and the factors that may affect variability in their
weights, I don't know which to choose.
T-test. If I use $\bar X$ and $S$ from the sample, I have a one-sample t test.
The test statistic is $T = (\bar X - \mu_2)/(S/\sqrt{n}) = (4 - 3.5)/(1/\sqrt{45}) = 3.354.$ Under $H_0,$ $T$ has Student's t distribution with
$n - 1 = 44$ degrees of freedom, so the 'critical value' for a two-sided test at
the 5% level of significance is 2.015. This means we would reject $H_0$
because $|T| = 3.354 > 2.015.$
In Minitab 17 statistical software, the test looks like this:
    One-Sample T

    Test of μ = 3.5 vs ≠ 3.5

     N   Mean  StDev  SE Mean      95% CI          T      P
    45  4.000  1.000    0.149  (3.700, 4.300)  3.35  0.002
From this information, we would reject $H_0$ because the P-value is less than 0.05 = 5%.
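For readers without Minitab, the same t test can be sketched in R directly from the summary statistics. This is only an illustration using base R's `qt` and `pt`; the variable names are my own, not part of the original problem.

```r
# One-sample t test from summary statistics (values taken from the problem)
n    <- 45      # sample size
xbar <- 4.0     # sample mean
s    <- 1.0     # sample SD
mu0  <- 3.5     # hypothesized mean (the 'race' mean)

se     <- s / sqrt(n)                     # standard error of the mean
t.stat <- (xbar - mu0) / se               # test statistic
dof    <- n - 1                           # degrees of freedom
t.crit <- qt(0.975, dof)                  # two-sided 5% critical value
p.val  <- 2 * pt(-abs(t.stat), dof)       # two-sided P-value
ci     <- xbar + c(-1, 1) * t.crit * se   # 95% confidence interval

round(c(t = t.stat, crit = t.crit, P = p.val), 3)
round(ci, 3)
```

The statistic, P-value and interval should agree with the Minitab output above up to rounding.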
Z-Test. If I use $\bar X$ from the sample and $\sigma = 6.633$ from the population, I have a one-sample z test.
The test statistic is $Z = (\bar X - \mu_2)/(\sigma/\sqrt{n}) = (4 - 3.5)/(6.633/\sqrt{45}) = 0.506.$ Under $H_0,$ $Z$ has a standard normal distribution, so the 'critical value' for a two-sided test at
the 5% level of significance is 1.96. One would reject $H_0$
if $|Z| > 1.96,$ but that is not the case here. Minitab output is as follows:
    One-Sample Z

    Test of μ = 3.5 vs ≠ 3.5
    The assumed standard deviation = 6.633

     N   Mean  SE Mean      95% CI          Z      P
    45  4.000    0.989  (2.062, 5.938)  0.51  0.613
Here, we would not reject $H_0$ at the 5% level because the P-value exceeds 0.05.
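The z test is just as easy to check by hand. The sketch below uses base R's `qnorm` and `pnorm` and treats $\sigma = 6.633$ as known; again, the variable names are only illustrative.

```r
# One-sample z test, treating the 'race' SD as the known population SD
n     <- 45       # sample size
xbar  <- 4.0      # sample mean
sigma <- 6.633    # assumed known population SD
mu0   <- 3.5      # hypothesized mean

se     <- sigma / sqrt(n)                 # standard error of the mean
z.stat <- (xbar - mu0) / se               # test statistic
z.crit <- qnorm(0.975)                    # two-sided 5% critical value (about 1.96)
p.val  <- 2 * pnorm(-abs(z.stat))         # two-sided P-value
ci     <- xbar + c(-1, 1) * z.crit * se   # 95% confidence interval

round(c(z = z.stat, crit = z.crit, P = p.val), 3)
round(ci, 3)
```

The only change from the t test is the much larger standard error, which is what turns a clear rejection into a non-rejection.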
I'm not quite sure about the meaning of the phrase 'estimate the level of
significance'. It is possible that this means to compare the two P-values.
Some authors refer to P-values as 'achieved' significance levels.
Interpretation. We got different results from the two tests. How might
that make sense?
From the point of view of the co-op where the 45 lambs
were born, they might say their lambs weigh more than the average for the 'race'.
Moreover, because of the very small SD, they might believe this is not a
one-time 'fluke', and that they are doing something that gets heavier lambs.
From the point of view of raisers of lambs nationwide, they might say nothing
special is going on here. The larger SD for the 'race' reflects differences
in local methods of breeding and caring for lambs.
Other tests? I don't immediately see what third test the textbook author may have in mind.
Perhaps s/he is thinking of using a one-sided alternative. Perhaps the
normality of the data is in doubt and some kind of nonparametric test is
intended.
Disparity in variability. Finally, I should mention that another major difference between the sample of
45 and the 'race' is the SD. There are formal statistical tests to check
whether $S = 1$ is significantly smaller than $\sigma = 6.633.$ But I won't
go into that because the question is clearly focused on comparing means.
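For the curious, one such test is the chi-squared test for a single variance. It is my choice for illustration, not something the question asks for, and it assumes normal data. A minimal sketch in R, using base R's `pchisq`:

```r
# Chi-squared test of H0: sigma1 = 6.633 against Ha: sigma1 < 6.633
# (assumes the 45 weights are a random sample from a normal population)
n      <- 45       # sample size
s      <- 1.0      # sample SD
sigma0 <- 6.633    # hypothesized SD (the 'race' SD)

chi.stat <- (n - 1) * s^2 / sigma0^2     # test statistic, chi-squared with n - 1 df under H0
p.val    <- pchisq(chi.stat, df = n - 1) # lower-tail P-value for the one-sided alternative

chi.stat
p.val
```

A statistic near 1 on 44 degrees of freedom lies far out in the lower tail, so the sample SD is very much smaller than the 'race' SD by this criterion.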
Addendum (pursuant to Comments): Here is a possible third test.
Theoretically, it is a bit of a reach, so consider it with care.
Take
the sample of $n_1 = 45$ sheep as Sample 1. Then consider that the information
about this 'race' of sheep must have originated with a very large
sample, say of size $n_2 = 2000$ (purely fictional, possibly reasonable) with $\bar X_2 = 3.5$ and $S_2 = 6.633.$
Also, consider that $n_1$ and $n_2$ are both large enough to use the
sample SDs as excellent estimates of population SDs.
With these (somewhat shaky and potentially controversial) assumptions, we could consider the test of a significant difference between $\bar X_1 = 4.0$ and $\bar X_2 = 3.5$ to be a two-sample z test. Then the test statistic is
$Z = (\bar X_1 - \bar X_2)/SE,$ where $SE^2 = \sigma_1^2/n_1 + \sigma_2^2/n_2,$
and one would reject $H_0$ if $|Z| > 1.96.$
As far as I know, there is no Minitab procedure for this, so the arithmetic (and probability computation) is done in R
statistical software below. I get $Z = 2.378$ (which you might check on a calculator), so we reject at the 0.05 level.
The P-value is $0.017 < 0.05.$
(Notice that $S_2^2$ is divided by a large $n_2,$ which tempers its effect,
but the result is still sensitive to the fictional $n_2$: if I take $n_2 = 1000,$ then $Z = 1.94,$ and
we are just a bit short of rejection.)
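    a1 = 4;   s1 = 1;     n1 = 45      # Sample 1: mean, SD, size
    a2 = 3.5; s2 = 6.633; n2 = 2000    # 'race': mean, SD, fictional size
    SE = sqrt(s1^2/n1 + s2^2/n2)       # standard error of the difference
    z = (a1 - a2)/SE;  z               # two-sample z statistic
    ## 2.377704
    2*pnorm(-2.377704)                 # two-sided P-value
    ## 0.0174208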
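Because $n_2$ is invented, it is worth seeing how much the conclusion depends on it. The sketch below wraps the same arithmetic in a small helper function (`z.two` is my own name, not a built-in) and evaluates it for a few hypothetical values of $n_2$.

```r
# Two-sample z statistic as a function of the (fictional) size n2
# Defaults are the summary values used in the computation above
z.two <- function(n2, a1 = 4, s1 = 1, n1 = 45, a2 = 3.5, s2 = 6.633) {
  se <- sqrt(s1^2 / n1 + s2^2 / n2)   # standard error of the difference
  (a1 - a2) / se
}

n2 <- c(500, 1000, 2000, 5000)        # hypothetical 'race' sample sizes
z  <- z.two(n2)
data.frame(n2 = n2,
           Z  = round(z, 3),
           P  = round(2 * pnorm(-abs(z)), 3))   # two-sided P-values
```

The decision at the 5% level flips somewhere between $n_2 = 1000$ and $n_2 = 2000,$ which is a good reason to treat this third test with caution.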