I think Bayes' theorem is intuitive if you just multiply through by the denominator. In that form it reads
$$P(A, B) = P(A|B)P(B) = P(B|A)P(A).$$
This is, in a way, the definition of conditional probability. Look at the second form:
$$P(A, B) = P(B|A)P(A).$$
There are many equivalent ways to interpret this. Let me give you one example.
Let $A$ be the event "I am hungry", and $B$ the event "I go to a restaurant". Note that it's possible for me to be hungry and not go to a restaurant, and it's also possible for me to go to a restaurant without being hungry. But the two are correlated: the chance of me going to a restaurant increases if I'm hungry. That means $P(B|A) > P(B)$. (This inequality is specific to this example; it does not hold for arbitrary events!) $P(B|A)$ is called the conditional probability because it is conditioned on $A$. It is the probability of $B$ happening given the knowledge that $A$ happens.
The joint probability is $P(A, B)$, the chance of both $A$ and $B$ happening. If I know $P(A)$ and $P(B|A)$, I can compute $P(A, B)$ as follows. First, ask how probable it is that $A$ happens. That is $P(A)$. Then, assuming $A$ happens, how probable is it that $B$ also happens? That is $P(B|A)$ by definition. Multiplying them together, I get the probability that both $A$ and $B$ happen.
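
To make this concrete, here is a small numerical sketch in Python. The joint probabilities in the table are made up purely for illustration (they are not from any data), and the variable names are my own. It computes $P(A)$, $P(B)$ and the two conditional probabilities from the table, then checks that both products give back $P(A, B)$.

```python
from fractions import Fraction

# Hypothetical joint distribution, numbers invented for illustration only.
# Keys are (hungry, goes_to_restaurant), i.e. (A, B).
joint = {
    (True, True): Fraction(3, 10),    # hungry and goes to a restaurant
    (True, False): Fraction(1, 10),   # hungry but stays home
    (False, True): Fraction(1, 10),   # not hungry but goes anyway
    (False, False): Fraction(5, 10),  # not hungry and stays home
}

# Marginals: sum the joint table over the other variable.
p_a = sum(p for (a, _), p in joint.items() if a)  # P(A)
p_b = sum(p for (_, b), p in joint.items() if b)  # P(B)

# Joint probability of both events happening.
p_ab = joint[(True, True)]                        # P(A, B)

# Conditional probabilities: restrict to the conditioning event, then renormalize.
p_b_given_a = p_ab / p_a                          # P(B|A)
p_a_given_b = p_ab / p_b                          # P(A|B)

# Both factorizations recover the same joint probability.
print(p_b_given_a * p_a == p_ab)                  # P(B|A) P(A) = P(A, B)
print(p_a_given_b * p_b == p_ab)                  # P(A|B) P(B) = P(A, B)

# Dividing through by P(B) gives the usual statement of Bayes' theorem.
print(p_a_given_b == p_b_given_a * p_a / p_b)

# In this made-up table, being hungry does raise the chance of going out.
print(p_b_given_a > p_b)
```

With these numbers, $P(A) = P(B) = 2/5$ and $P(B|A) = 3/4$, so $P(B|A)P(A) = 3/10 = P(A, B)$. The checks hold by construction, which is exactly the point: the product rule is just the definition of conditional probability rearranged.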