First, by ii), we can reduce to the case $M=\mathbb{R}^d$, where $d=\dim M$. To see this, given $m\in M$, choose a coordinate chart $(U,\varphi)$ such that $U\ni m$ is open in $M$, $\varphi(m)=0$, and $\varphi:U\to \mathbb{R}^d$ is a diffeomorphism, so that $U$ can be identified with $\mathbb{R}^d$. Moreover, there exists a smooth function $f:M\to\mathbb{R}$ such that $f=1$ on $\varphi^{-1}(B(0,1))$ and $f=0$ outside $\varphi^{-1}(B(0,2))$, where $B(0,r)$ is the ball in $\mathbb{R}^d$ centered at $0$ with radius $r$. Then by ii), $\nabla_{fX}Y=f\nabla_XY$, and since $f(m)=1$, the value of $\nabla_XY$ at $m$ depends only on $fX$. Since $fX$ vanishes outside $U$, we can reduce $M$ to $\mathbb{R}^d$.
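To spell out the evaluation at $m$ used in this step: the argument needs $f(m)=1$, which holds when the chart is centered at $m$, i.e. $\varphi(m)=0$ (an assumption we may always arrange by translating $\varphi$). Under that assumption,
$$(\nabla_{fX}Y)(m)=f(m)\,(\nabla_XY)(m)=(\nabla_XY)(m),$$
so replacing $X$ by the compactly supported field $fX$ does not change the value of $\nabla_XY$ at $m$.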
Now assume that $M=\mathbb{R}^d$ with coordinates $x^1,\dots,x^d$. Then for each vector field $X$ on $\mathbb{R}^d$, there exist smooth functions $X^1,\dots,X^d:\mathbb{R}^d\to \mathbb{R}$ such that $X=\sum_{i=1}^dX^i\frac{\partial}{\partial x^i}$. Then by i) and ii), $\nabla_XY=\sum_{i=1}^dX^i\nabla_{\frac{\partial}{\partial x^i}}Y$, whose value at $m$ depends only on $X(m)=\sum_{i=1}^dX^i(m)\left.\frac{\partial}{\partial x^i}\right|_m$.
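As a short check of the last claim, let $\tilde X=\sum_{i=1}^d\tilde X^i\frac{\partial}{\partial x^i}$ be a second vector field with $\tilde X(m)=X(m)$ (the notation $\tilde X$ is introduced here only for illustration). Then $\tilde X^i(m)=X^i(m)$ for every $i$, and evaluating the formula above at $m$ gives
$$(\nabla_XY)(m)=\sum_{i=1}^dX^i(m)\,\big(\nabla_{\frac{\partial}{\partial x^i}}Y\big)(m)=\sum_{i=1}^d\tilde X^i(m)\,\big(\nabla_{\frac{\partial}{\partial x^i}}Y\big)(m)=(\nabla_{\tilde X}Y)(m).$$
So two vector fields that agree at $m$ give the same covariant derivative of $Y$ at $m$.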