Two features we wanted to add to our database structure were versioning and source record tie-backs. That makes each insert pretty complicated. The following diagram is a simplified (and obscured) version of what I've been working with.
So after playing around with various fancy tricks using CTE inserts I gave up and went to a simple in-line function. We're using Postgres for this experiment. The first test run used 378,755 records from two sources and had a total run time of 6 hours 32 minutes and 14.917 seconds. That is pretty pathetic.
We put the queries into DataGrip and ran a few SQL Explains and after some tinkering we added six simple indexes to the system. We ended up with something like this;
The end result was amazing!
On the second run, using the same data set, we were able to get the total runtime down to 84.182 seconds. That is a world changing impact for so few indexes.
ProTip: Before you abandon your experiment with the database, make sure you have appropriate indexes.