Given that the Wiki definition may be too mathematically formal for the OP, let me give some intuition for partial differential equations, starting from the first-order case.
First, consider a first-order ordinary differential equation
$$ \frac{\mathrm{d}}{\mathrm{d}t} X = F(t,X) $$
where $X$ takes values in, say, $\mathbb{R}^n$ and $F(t,X)$ is some Lipschitz continuous function. (In other words, this is a dynamical system.) The equation tells us how $X$ ought to change at an instant in time $t$, based on the time $t$ and the current value of $X$. This is what we call an "evolutionary point of view".
The "evolutionary" analogue for partial differential equations is an equation for $X$, which now depends not only on the time $t$ but also on some spatial coordinates $(x_1, \ldots, x_N)$. It would be something like
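To make the evolutionary reading concrete, here is a minimal sketch of the forward Euler method, which simply applies the rule "change $X$ by $F(t,X)$ times a small time step" over and over. The function name `euler` and the test system $\mathrm{d}X/\mathrm{d}t = -X$ are illustrative assumptions, not something taken from the question.

```python
import math

def euler(F, t0, X0, t1, steps):
    """Advance dX/dt = F(t, X) from t0 to t1 with forward Euler steps."""
    h = (t1 - t0) / steps
    t, X = t0, list(X0)
    for _ in range(steps):
        dX = F(t, X)                          # the rule: how X should change right now
        X = [x + h * d for x, d in zip(X, dX)]
        t += h
    return X

# Illustrative system (an assumption): dX/dt = -X with X(0) = 1,
# whose exact solution is exp(-t).
approx = euler(lambda t, X: [-x for x in X], 0.0, [1.0], 1.0, 1000)
print(approx[0], math.exp(-1.0))
```

The only input at each step is the current time and the current state, which is exactly what "evolutionary" means here.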
$$ \frac{\partial}{\partial t} X = F\left(t,x_1, \ldots, x_N, X, \frac{\partial}{\partial x_1} X, \ldots, \frac{\partial}{\partial x_N} X\right) $$
Now how $X$ ought to change at an instant in time $t$ and at position $(x_1,\ldots, x_N)$ is given by a function of not only the coordinate values $t$ and $(x_1, \ldots, x_N)$, but also the value of $X$ at that space-time point and the values of its spatial partial derivatives at that space-time point.
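As a hedged illustration, take one space dimension and the particular choice $F = -c\,\partial X/\partial x_1$ with constant $c > 0$ (an assumed example). The sketch below replaces the spatial derivative by a backward difference on a periodic grid and steps forward in time; the grid size, time step, and initial bump are all arbitrary choices made for the demonstration.

```python
import math

def step(X, dx, dt, c=1.0):
    """One explicit upwind step for dX/dt = -c * dX/dx (c > 0), periodic in x."""
    n = len(X)
    # Backward difference approximates dX/dx; X[-1] wraps around (periodic grid).
    return [X[i] - dt * c * (X[i] - X[i - 1]) / dx for i in range(n)]

n = 200
dx = 1.0 / n
dt = 0.5 * dx            # keeps c*dt/dx <= 1, needed for stability of this scheme
X = [math.exp(-100 * (i * dx - 0.3) ** 2) for i in range(n)]   # bump centred at 0.3
for _ in range(160):     # advance to time 160 * dt = 0.4
    X = step(X, dx, dt)
peak = max(range(n), key=lambda i: X[i]) * dx
print(peak)              # the bump has been carried to the right, to about x = 0.7
```

Each update at a grid point uses only the value of $X$ there and a difference quotient standing in for its spatial derivative, mirroring the structure of the equation. The scheme also smears the bump somewhat, which is a numerical artefact and not a feature of the equation.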
There is a different point of view, however, for ordinary differential equations. This is the "constraint point of view". For this we consider the equation 
$$ X'' = F(s,X) $$
and try to solve it while prescribing boundary conditions $X(0) = f_1$ and $X(1) = f_2$. We should think of the differential equation as describing a "compatibility condition" for a certain physical system in stasis. For example, the above equation can be used to describe the distribution of temperature along a rod that is kept at temperature $f_1$ at one end and temperature $f_2$ at the other. The equation says that the second derivative of the temperature function depends on the physical characteristics of the rod at the point $s$ as well as the temperature $X(s)$ at that point. In other words, the laws of nature constrain what temperature profiles are possible.
From this point of view, we also get a type of partial differential equation that describes a constraint. In this case, the PDE is usually written as an analytic expression relating the various partial derivatives of a function. What this says is that, for the question we are considering, not all functions are admissible as solutions: some law (most frequently a physical law) requires that the only admissible functions describing the situation (this is the constraint) obey certain relationships imposed upon their Taylor coefficients up to some order $k$ at every point. In other words, the function is not allowed to wiggle willy-nilly. Its rates of change in the various directions are tied together.
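For a concrete feel of the constraint point of view, here is a minimal sketch that solves the simplest special case, where the right-hand side depends only on $s$, by central differences. The function name `solve_bvp`, the uniform source term $-1$, and the boundary values $0$ and $1$ are assumptions chosen for illustration.

```python
def solve_bvp(rhs, f1, f2, n=100):
    """Solve X'' = rhs(s) on [0, 1] with X(0) = f1, X(1) = f2 (requires n >= 2)."""
    h = 1.0 / n
    m = n - 1                                 # number of interior unknowns
    # Discrete constraint at each interior point:
    #   X[i-1] - 2*X[i] + X[i+1] = h^2 * rhs(s_i)
    d = [h * h * rhs((i + 1) * h) for i in range(m)]
    d[0] -= f1                                # known boundary values go to the right side
    d[-1] -= f2
    # Thomas algorithm for the tridiagonal system (forward elimination)
    diag = [-2.0] * m
    for i in range(1, m):
        w = 1.0 / diag[i - 1]
        diag[i] -= w
        d[i] -= w * d[i - 1]
    # Back substitution
    X = [0.0] * m
    X[-1] = d[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        X[i] = (d[i] - X[i + 1]) / diag[i]
    return [f1] + X + [f2]

# Assumed example: X'' = -1, X(0) = 0, X(1) = 1; exact solution -s^2/2 + 3s/2.
profile = solve_bvp(lambda s: -1.0, 0.0, 1.0, 100)
print(profile[50])    # value at s = 0.5; the exact solution gives 0.625
```

Note that nothing is marched forward here: all interior values are determined simultaneously by the equation together with both boundary values.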
Intuition aside, the mathematical formulation of a PDE can be stated relatively simply. 
A partial differential equation is an equation which expresses an equality between expressions involving partial derivatives of a given function. More precisely, taking one of the simpler cases, a partial differential equation on a scalar function $u$ defined on some subset $U\subseteq \mathbb{R}^N$ is the equation
$$ F(x,u,\nabla u, \nabla^2 u, \ldots , \nabla^k u) = 0 $$
where $x\in U\subseteq \mathbb{R}^N$ are the independent variables, $\nabla^j u$ is the tensor of all $j$-fold partial derivatives of $u$ ($\nabla u$ is the gradient vector, $\nabla^2 u$ is the Hessian matrix), and $F$ is some function
$$ F: U \times \mathbb{R} \times \mathbb{R}^N \times \mathbb{R}^{N^2} \times \cdots \times \mathbb{R}^{N^k} \to \mathbb{R} $$
The number $k$, the maximum order of the derivatives involved in the equation, is called the "order" of the equation. 
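To see what "plugging $u$ and its derivatives into $F$" means in practice, the following sketch approximates $(u, \nabla u, \nabla^2 u)$ at a point by central differences and feeds them to a second-order $F$. The helper `jet2`, the choice $F(x,u,p,q) = \operatorname{trace} q$, and the two test functions are all illustrative assumptions.

```python
def jet2(u, x, h=1e-3):
    """Approximate (u, grad u, Hessian u) at the point x by central differences."""
    n = len(x)

    def shifted(*moves):
        # Evaluate u at x displaced by s*h along coordinate i for each (i, s).
        y = list(x)
        for i, s in moves:
            y[i] += s * h
        return u(y)

    p = [(shifted((i, 1)) - shifted((i, -1))) / (2 * h) for i in range(n)]
    q = [[(shifted((i, 1), (j, 1)) - shifted((i, 1), (j, -1))
           - shifted((i, -1), (j, 1)) + shifted((i, -1), (j, -1))) / (4 * h * h)
          for j in range(n)] for i in range(n)]
    return u(x), p, q

def F_trace(x, u, p, q):
    """An assumed second-order F: the trace of the Hessian slot."""
    return sum(q[i][i] for i in range(len(q)))

harmonic = lambda x: x[0] ** 2 - x[1] ** 2    # satisfies F = 0 everywhere
bowl = lambda x: x[0] ** 2 + x[1] ** 2        # does not: trace of Hessian is 4
point = [0.3, 0.7]
res_harmonic = F_trace(point, *jet2(harmonic, point))
res_bowl = F_trace(point, *jet2(bowl, point))
print(res_harmonic, res_bowl)                 # approximately 0 and 4
```

A function solves the equation exactly when this residual vanishes at every point of $U$; the first test function passes and the second does not.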
For some simple examples:
- The transport equation (or linear advection equation) is the case where $k = 1$ and $$ F(x,u,p) = V(x)\cdot p $$
where $p\in \mathbb{R}^N$ and $V(x)$ is some vector field on $U$.  
- The Laplace equation is when $k = 2$ and $$F(x,u,p,q) = \operatorname{trace} q $$ where $p\in\mathbb{R}^N$ and $q\in \mathbb{R}^{N^2}$ is interpreted as an $N\times N$ matrix.  
- The wave equation is when $k = 2$ and $$F(x,u,p,q) = \operatorname{trace} q - T^\dagger q T $$ where $\dagger$ denotes the transpose and $T$ is a vector with $\|T\|^2 > 1$.  
- The linear Schrödinger equation is also when $k = 2$ and $$F(x,u,p,q) = \operatorname{trace} q - T^\dagger q T - i T\cdot p $$ where $T$ is a vector with $\|T\|^2 = 1$. If we remove the imaginary unit $i$ from the equation, we end up with the linear heat equation instead. Note that for Schrödinger's equation we necessarily need $u$ to take values in the complex numbers $\mathbb{C}$, and so its gradient and Hessian will be a complex-valued vector and a complex-valued matrix.  
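
To check that these abstract formulas reproduce the familiar equations, here is a worked instance under an assumed choice of coordinates (one illustrative choice among many): take $N = 2$, write $x = (t, y)$, and let $T = (a, 0)$ point along the $t$-axis. Then
$$ q = \begin{pmatrix} u_{tt} & u_{ty} \\ u_{ty} & u_{yy} \end{pmatrix}, \qquad \operatorname{trace} q = u_{tt} + u_{yy}, \qquad T^\dagger q T = a^2 u_{tt}, \qquad T\cdot p = a\, u_t $$
so that
$$ \operatorname{trace} q - T^\dagger q T = (1 - a^2)\, u_{tt} + u_{yy} $$
With $a^2 = 2$ (so $\|T\|^2 > 1$) the equation $F = 0$ reads $u_{yy} - u_{tt} = 0$, the wave equation. With $a = 1$ (so $\|T\|^2 = 1$) the $u_{tt}$ term cancels and
$$ \operatorname{trace} q - T^\dagger q T - i T\cdot p = u_{yy} - i u_t $$
so $F = 0$ is $i u_t = u_{yy}$, the free Schrödinger equation up to sign and normalisation conventions; dropping the $i$ gives $u_t = u_{yy}$, the heat equation. Similarly, the transport equation with the constant vector field $V = (1, c)$ reads $u_t + c\, u_y = 0$.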
And now, for an extremely high-brow definition (which is a bit beyond the "beginner's scope" asked by the original poster, but nonetheless interesting):
A partial differential relation (of which a partial differential equation is a special type) for a fibre bundle $F$ over some smooth manifold $M$ is a subset $\mathcal{R}\subseteq F^{(r)}$ of the $r$-th jet bundle of $F$ over $M$. A partial differential equation is one where $\mathcal{R}$ has codimension 1. To bring it back to the simplest case defined above: a class of simple fibre bundles are the trivial bundles $F = M\times N$. Here $M$ is the domain of the independent variables (what is $U$ in the definition above), and $N$ is the domain of the dependent variables (what is $\mathbb{R}$ or $\mathbb{C}$ above; we can also allow vector-valued dependent variables taking values in, say, $\mathbb{R}^n$ or $\mathbb{C}^n$, in which case we get what are sometimes called systems of partial differential equations). The $k$-th jet bundle is, roughly speaking, the set of all possible $k$-th order Taylor expansions; in other words, it represents the space $\mathbb{R}\times \mathbb{R}^N \times \mathbb{R}^{N^2}\times \cdots \times \mathbb{R}^{N^k}$ of the value of the function and all its (partial) derivatives up to order $k$.
Then the single equation $F(x,u,p,q,r,\ldots,s) = 0$, the partial differential equation, should carve out a codimension 1 subset of $U \times \mathbb{R} \times \cdots \times\mathbb{R}^{N^k}$. (See my question on MO for some tangentially related discussions.) 
Further reading
- Sergiu Klainerman's essay, an abridged version of which appeared in the Princeton Companion to Mathematics. It assumes a little more background than an absolute beginner has, but not too much more.  
- Jürgen Jost's Partial Differential Equations textbook, while on the whole perhaps a bit too advanced for the OP, has a short introductory chapter titled "What are Partial Differential Equations?", which should also give some intuition.  
- Ka Kit Tung's Partial differential equations and Fourier analysis - A short introduction is a textbook aimed at students who have had one year of calculus and one course in ordinary differential equations. It has a decent first chapter reviewing ODEs, and a second chapter explaining the physical origins of partial differential equations while comparing and contrasting them with ordinary differential equations, which the OP understands better. This may be a reasonable first book for the OP to consult.