This answer advocates both for more rigorous notation and for an easy way to solve such questions, relying as much as possible on the wonders of linear algebra.
First, if $U$ is a function defined on $\mathbb R^n$ with values in $\mathbb R$, the notation $\frac{d}{dx}U(x)$ is strange when $n\geqslant2$ because $\frac{d}{dx}$ usually denotes the derivative operator applied to a function from $\mathbb R$ to $\mathbb R$. For example, if $u$ is defined on $\mathbb R$ by $u(x)=\mathrm e^{x^2}$, I know that $\frac{d}{dx}u(x)=2x\mathrm e^{x^2}$ but if $U$ is defined on $\mathbb R^n$ by $U(x)=\mathrm e^{\|x\|^2}$, I do not know the meaning of the notation $\frac{d}{dx}U(x)$ when $n\geqslant2$.
Here, one looks for the gradient $\nabla U(x)$ of $U$ at $x$. Loosely speaking, $\nabla U(x)$ is a vector in $\mathbb R^n$, but strictly speaking it is a linear form defined on $\mathbb R^n$ (this $\mathbb R^n$ being the tangent vector space of the manifold $\mathbb R^n$ at $x$). Rigorously, $\nabla U(x):\mathbb R^n\to\mathbb R$ is defined by the fact that, for every vector $v$ in $\mathbb R^n$,
$$ \nabla U(x)(v)=\lim\limits_{h\to0}\frac1h(U(x+hv)-U(x)), $$
if the limit exists.

When $U$ is differentiable at $x$, this defines a linear function $\nabla U(x):\mathbb R^n\to\mathbb R$. The identification of this function with an element $w$ of $\mathbb R^n$ comes from identifying the tangent space of the manifold $\mathbb R^n$ at $x$ with the vector space $\mathbb R^n$ itself through the choice of a vector basis. This basis $B$ defines a scalar product on $\mathbb R^n$ by $(v_1,v_2)\mapsto v_1^Tv_2$, using the decompositions of $v_1$ and $v_2$ in $B$, and one gets the relation
$$ \nabla U(x)(v)=w^Tv. $$
Such a vector $w$ is often denoted $w=\mathrm{grad}\ U(x)$, and both $\nabla$ and $\mathrm{grad}$ are pronounced "gradient". Thus, writing $v$ and $w$ in the basis $B$ as $v=(v_i)_i$ and $w=\left(\frac{\partial U}{\partial x_i}(x)\right)_i$, one gets
$$ \nabla U(x)(v)=w^Tv=\sum\limits_iw_iv_i=\sum\limits_i\frac{\partial U}{\partial x_i}(x)\ v_i. $$

To sum up everything above:
- To compute $\nabla U(x)=w$ is to write $U(x+hv)=U(x)+hw^Tv+o(h)$ for every $v$ in $\mathbb R^n$ when $h$ in $\mathbb R$ goes to $0$.
- The fact that $\nabla U(x)=w$ is equivalent to the fact that $\nabla U(x)(v)=w^Tv$ for every vector $v$, which is equivalent to the fact that $\dfrac{\partial U}{\partial x_i}(x)=w_i$ for every $i$.
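
The characterization above can be checked numerically. The following is a minimal sketch, not part of the original argument: it takes $U(x)=\mathrm e^{\|x\|^2}$, for which expanding $\|x+hv\|^2=\|x\|^2+2h\,x^Tv+h^2\|v\|^2$ gives the candidate $w=2\mathrm e^{\|x\|^2}x$, and compares $w^Tv$ with a finite-difference estimate of the limit. The helper name `directional_derivative`, the test point, the direction and the step size are all illustrative assumptions, and a central difference is used in place of the one-sided quotient purely for numerical accuracy.

```python
import math


def U(x):
    """U(x) = exp(||x||^2), the example function from the text."""
    return math.exp(sum(t * t for t in x))


def directional_derivative(f, x, v, h=1e-6):
    """Central-difference estimate of lim_{h->0} (f(x + h v) - f(x)) / h."""
    x_plus = [xi + h * vi for xi, vi in zip(x, v)]
    x_minus = [xi - h * vi for xi, vi in zip(x, v)]
    return (f(x_plus) - f(x_minus)) / (2 * h)


# Arbitrary test point and direction (assumptions chosen for illustration).
x = [0.3, -0.5, 0.2]
v = [1.0, 2.0, -1.0]

# Candidate gradient w = 2 exp(||x||^2) x.
w = [2 * U(x) * xi for xi in x]

lhs = directional_derivative(U, x, v)        # numerical estimate of grad U(x)(v)
rhs = sum(wi * vi for wi, vi in zip(w, v))   # w^T v
print(lhs, rhs)
```

The two printed numbers should agree up to the finite-difference error, which is what $\nabla U(x)(v)=w^Tv$ predicts.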
Let us now compute the gradients of your examples. We will make heavy use of the fact that, for all matrices $C$ and $D$ of suitable dimensions, $(CD)^T=D^TC^T$, and of the fact that $z^T=z$ for every $1\times1$ matrix (also known as a number), but of pretty much nothing else. Here we go.
1. If $U(x)=b^TAx$, then $U(x+hv)-U(x)=h(b^TAv)=h(A^Tb)^Tv$, hence $\nabla U(x)=A^Tb$.
2. If $U(x)=x^TAb$, then $U(x+hv)-U(x)=h(v^TAb)=h(Ab)^Tv$, hence $\nabla U(x)=Ab$.
3. If $U(x)=x^TAx$, then $U(x+hv)-U(x)=h(v^TAx+x^TAv)+h^2v^TAv$, where the last term is $o(h)$. Hence $\nabla U(x)(v)=v^TAx+x^TAv=(Ax)^Tv+(A^Tx)^Tv$, that is, $\nabla U(x)=Ax+A^Tx=(A+A^T)x$.
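
The three gradients can be sanity-checked against finite differences. This is a small sketch under stated assumptions: the helpers `matvec`, `transpose`, `dot` and `num_grad` are hypothetical names written here with plain Python lists, and the matrix $A$ (deliberately non-symmetric), the vector $b$ and the point $x$ are arbitrary test data.

```python
def matvec(A, x):
    """Matrix-vector product A x for a list-of-rows matrix."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]


def transpose(A):
    """Transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*A)]


def dot(u, v):
    """Scalar product u^T v."""
    return sum(a * b for a, b in zip(u, v))


def num_grad(f, x, h=1e-6):
    """Central-difference estimate of the partial derivatives of f at x."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


# Arbitrary test data; A is non-symmetric so that A x and A^T x differ.
A = [[1.0, 2.0, 0.0], [-1.0, 3.0, 4.0], [0.5, 0.0, -2.0]]
b = [1.0, -2.0, 0.5]
x = [0.3, -0.5, 0.2]
At = transpose(A)

# Each entry pairs a function U with the gradient claimed in the text.
cases = {
    "b^T A x": (lambda y: dot(b, matvec(A, y)), matvec(At, b)),
    "x^T A b": (lambda y: dot(y, matvec(A, b)), matvec(A, b)),
    "x^T A x": (lambda y: dot(y, matvec(A, y)),
                [p + q for p, q in zip(matvec(A, x), matvec(At, x))]),
}

max_err = {}
for name, (f, claimed) in cases.items():
    estimate = num_grad(f, x)
    max_err[name] = max(abs(e - c) for e, c in zip(estimate, claimed))
    print(name, max_err[name])
```

Each printed error should be tiny, since the three functions are linear or quadratic and central differences are exact for them up to rounding.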
To turn these considerations into concrete formulas, let us compute the coordinates of the gradient in cases 1 and 3. In case 1, one gets
$$ \frac{\partial U}{\partial x_i}(x)=(A^Tb)_i=\sum\limits_j(A^T)_{ij}b_j=\sum\limits_jA_{ji}b_j, $$
and in case 3,
$$ \frac{\partial U}{\partial x_i}(x)=((A+A^T)x)_i=\sum\limits_j(A+A^T)_{ij}x_j=\sum\limits_j(A_{ij}+A_{ji})x_j. $$
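
As a sanity check on the coordinate formula for $U(x)=x^TAx$, here is the computation written out for $n=2$ (the choice $n=2$ is only for illustration). Expanding the product gives
$$ U(x)=x^TAx=A_{11}x_1^2+(A_{12}+A_{21})x_1x_2+A_{22}x_2^2, $$
and differentiating term by term,
$$ \frac{\partial U}{\partial x_1}(x)=2A_{11}x_1+(A_{12}+A_{21})x_2,\qquad \frac{\partial U}{\partial x_2}(x)=(A_{12}+A_{21})x_1+2A_{22}x_2, $$
which agrees with $\sum\limits_j(A_{ij}+A_{ji})x_j$ for $i=1$ and $i=2$.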