Conversation
|
Some details about how the new approach is different from the old one would be appreciated. |
|
The insert code path now avoids a lot of unnecessary steps, such as triggers, validations (except field-level NOT NULL) and index maintenance - all of these are absent during restore. Also, all records are inserted into a dedicated in-memory buffer to avoid endless latches on data page buffers. When the in-memory buffer is full, its contents are copied into the actual DB buffers and go to disk in the usual way. The in-memory buffer starts at 1 page and is resized to 8 pages once the first 8 pages of the relation are filled. Blob contents are put into a separate in-memory buffer that works in the same way. The code is put into two main classes:
* |BulkInsert| - implements the in-memory buffers and the code that works with records (mostly an analog/copy of some DPM parts), and
* |BulkInsertNode| - replaces |StoreNode| and contains some further optimizations, such as pre-calculated target descriptors.
|
|
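The grow-and-flush behavior described in the comment above can be illustrated with a small sketch. This is a hypothetical, simplified model, not the actual |BulkInsert| implementation: the page size, the class and member names, and the flush callback are all assumptions. It only demonstrates the stated policy of starting at 1 page, flushing to the real DB buffers when full, and growing to 8 pages once the first 8 pages of the relation are filled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch (not the actual Firebird code): an append buffer
// that starts at one page and grows to eight pages once the first eight
// pages of the relation have been filled. When full, its contents are
// handed to a flush callback that stands in for copying into the real
// database buffers.
class GrowingPageBuffer {
public:
    static constexpr std::size_t PAGE_SIZE = 8192;  // assumed page size
    static constexpr std::size_t MAX_PAGES = 8;

    // Append a record; returns true if a flush happened first.
    template <typename Flush>
    bool append(const std::vector<char>& record, Flush flush)
    {
        const std::size_t capacityBytes = capacityPages * PAGE_SIZE;
        bool flushed = false;
        if (data.size() + record.size() > capacityBytes) {
            flush(data);                 // copy into "real" DB buffers
            pagesWritten += capacityPages;
            data.clear();
            flushed = true;
            // After the first 8 pages of the relation are filled,
            // grow the buffer from 1 page to 8 pages.
            if (pagesWritten >= MAX_PAGES)
                capacityPages = MAX_PAGES;
        }
        data.insert(data.end(), record.begin(), record.end());
        return flushed;
    }

    std::size_t capacity() const { return capacityPages * PAGE_SIZE; }

private:
    std::vector<char> data;
    std::size_t capacityPages = 1;   // start from 1 page
    std::size_t pagesWritten = 0;
};
```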
On 4/15/26 17:55, Vlad Khorsun wrote:
*hvlad* left a comment (FirebirdSQL/firebird#8990)
<#8990 (comment)>
The code is put into two main classes:
* |BulkInsert| - implements the in-memory buffers and the code that works with records (mostly an analog/copy of some DPM parts), and
* |BulkInsertNode| - replaces |StoreNode| and contains some further optimizations, such as pre-calculated target descriptors.
But what about |relPages->rel_*_data_space|, stored at the database level rather than the attachment level? From the new code it seems they should be moved back from the database to the attachment. Am I missing something?
|
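As a reader's sketch, the two-class split named in the quoted comment could look roughly like this. Only the class names |BulkInsert| and |BulkInsertNode| come from the comment; every member, type, and signature below is illustrative and does not reflect the actual Firebird declarations.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for a pre-built set of field descriptors (illustrative only).
struct RecordDesc { int fieldCount; };

// Owns the in-memory record/blob buffers and the record-level code
// (per the comment, mostly an analog of some DPM parts).
class BulkInsert {
public:
    void store(const char* /*rec*/, std::size_t len) { bytes += len; ++records; }
    std::size_t storedRecords() const { return records; }
private:
    std::size_t bytes = 0;
    std::size_t records = 0;
};

// Replaces StoreNode on the restore path. The point illustrated here:
// target descriptors are computed once, at construction, instead of
// being re-derived for every record.
class BulkInsertNode {
public:
    explicit BulkInsertNode(const RecordDesc& desc) : targetDesc(desc) {}
    void execute(BulkInsert& sink, const char* rec, std::size_t len)
    {
        // targetDesc would drive field conversion here, reused as-is
        // for every record rather than rebuilt per row.
        sink.store(rec, len);
    }
private:
    const RecordDesc targetDesc;   // pre-calculated target descriptors
};
```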
These fields are not used by the bulk insert code and thus were not affected by this patch. At the next step I'm going to: leave as is: make atomic: remove: |
|
I think the term "bulk insert" is a little misleading here. Oracle uses "direct-path insert" for the path where data goes straight into data pages. |
In Firebird 5, parallel restore was introduced. It contains "bulk insert" code that lowers contention on pointer pages (PP) by concurrent writers, and also makes each writer use its own dedicated data page (DP) to fill with records.
With the shared metadata cache in v6 that code became broken, and parallel restore creates a lot of unused data pages. Instead of fixing that first attempt at a bulk insert ability, I offer a new approach that fixes the issue, has better performance and could be used more widely.
Note: the patch doesn't remove the old code; that could be done a bit later, after agreement on the new code.