Suppose we have (an interval of) a time series of measurements:

We assume it can be explained as a "simple" underlying signal overlaid by noise. I'm interested in finding a good algorithm to estimate the value of the simple signal at a given point in time -- primarily for the purpose of displaying it to human users who know, more or less, how to interpret the signal but would be distracted by the noise.
For a human observer of the plot it looks very clearly like the underlying signal has a jump discontinuity at about $t=18$. But that's a problem for automatic noise removal, because the techniques I know are all predicated on a "nice" underlying signal, meaning a "smooth" one. A typical anti-noise filter would be something like convolving with a Gaussian kernel:

which completely fails to convey that the left slope of the dip is any different from the right one. My current solution is to use a "naive" rolling average (i.e. convolution with a square kernel):

whose benefit (aside from simplicity) is that at least the sharp bends in the signal estimate alert the viewer that something fishy is going on. But it takes quite some training for the viewer to know what fishy thing this pattern indicates. And it is still a tricky business to pinpoint when the abrupt change happened, which is sometimes important in my application.
The widths of the convolution kernels in the two examples above were chosen to give about the same smoothing of the pure noise (since I've cheated and actually constructed the sample data I'm showing as the sum of a crisp deliberate signal and some explicit noise). If we make them narrower, we can get the estimate to show that there's an abrupt change going on, but then they don't remove all of the noise:
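
For anyone who wants to reproduce the comparison, here is a minimal sketch in Python with NumPy. The test signal is a hypothetical stand-in (a jump at $t=18$ followed by a slow recovery, plus Gaussian white noise), not the actual data behind the plots, and all function names are made up for illustration. It assumes the kernel widths are matched by equalizing the white-noise gain $\sum_i k_i^2$, which is about $1/(2\sqrt{\pi}\,\sigma)$ for a Gaussian of standard deviation $\sigma$ samples and $1/N$ for a square kernel of $N$ samples, so $N \approx 2\sqrt{\pi}\,\sigma$.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized Gaussian kernel truncated at 4 sigma (sigma in samples)."""
    half = int(np.ceil(4 * sigma))
    x = np.arange(-half, half + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def box_kernel(width):
    """Normalized square kernel (plain rolling average); width should be odd."""
    return np.full(width, 1.0 / width)

def matched_box_width(sigma):
    """Nearest odd box width with roughly the same white-noise gain as the Gaussian."""
    w = 2.0 * np.sqrt(np.pi) * sigma
    return max(1, 2 * int(round((w - 1) / 2)) + 1)

def noise_gain(kernel):
    """Factor by which the kernel scales the variance of white noise."""
    return float(np.sum(kernel ** 2))

def smooth(y, kernel):
    """Convolve y with an odd-length kernel, padding with edge values
    so the output has the same length as the input."""
    half = len(kernel) // 2
    padded = np.pad(y, half, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

# Hypothetical test data: flat, abrupt drop at t = 18, linear recovery until t = 30.
rng = np.random.default_rng(0)
t = np.arange(0.0, 40.0, 0.1)
clean = np.where(t < 18, 1.0, np.clip((t - 18) / 12, 0.0, 1.0))
noisy = clean + 0.1 * rng.standard_normal(t.size)

sigma = 5.0  # Gaussian width in samples (an arbitrary choice for this sketch)
g = gaussian_kernel(sigma)
b = box_kernel(matched_box_width(sigma))

est_gauss = smooth(noisy, g)  # smooth everywhere, the jump is blurred into a slope
est_box = smooth(noisy, b)    # piecewise-linear ramp with sharp bends around the jump
```

Both kernels are symmetric, so these estimates commute with time reversal. Reducing `sigma` (and the matched box width with it) sharpens the edge at the jump, but `noise_gain` grows accordingly, which is the trade-off described above.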

I can't be the first person ever to face this problem. Does it ring a bell for anyone? I'd appreciate ideas, pointers to literature, search terms, a conventional name for the problem, whatever.
Miscellaneous remarks:
- No, I cannot rigorously define what it is I want to optimize for. That would be too easy. 
- It would be nice if the smoothed signal could show clearly that there's no significant systematic change in the signal before the jump at $t=18$. 
- In the example I show here, the jump was much larger than the amplitude of the noise. That's not always the case in practice. 
- I don't need real-time behavior, so it is fine that the estimate of the signal at some time depends on later samples. In fact I'd prefer to find a solution that commutes with time reversal. 
- Basing a solution on outside knowledge about how the particular signal I'm looking at ought to behave is not an option. There are too many different measurements I want to apply it to, and often it's something we don't have a good prior expectation for. 
