Asynchronous materialization of samples view #7734

Open
XingY wants to merge 29 commits into develop from fb_asyncSamplesTable

@XingY XingY commented Jun 9, 2026

Contributor

Rationale

On server start, or after a sample designer update, the first request to load the sample grid for a large sample type triggers a synchronous full-table materialization, which can take a long time when the sample type has many rows. If the user navigates away before it completes, the HTTP request is cancelled mid-flight, aborting the SELECT INTO. The materialized view is never populated, so subsequent visits repeat the same blocking behavior indefinitely.

This change decouples view population from the request. When the materialized view is not ready (never built, or stale due to pending incremental updates), the request falls back immediately to direct JOINs and submits a background rebuild. The background task runs to completion regardless of user navigation, so the next visit uses the fast materialized path.
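The fallback-plus-background-rebuild flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `AsyncViewSketch`, its `State` enum, and the two SQL strings are hypothetical stand-ins, and the real helper also has to handle staleness and incremental updates.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for the materialized query helper; not the actual LabKey class. */
final class AsyncViewSketch
{
    // Placeholder SQL strings, used only to tell the two paths apart.
    static final String MATERIALIZED_SQL = "SELECT * FROM _cached_view_";
    static final String JOIN_SQL = "SELECT * FROM /* direct JOINs */ base_tables";

    enum State { NOT_BUILT, BUILDING, READY }

    private final ExecutorService executor;
    private final Runnable buildTask;   // stands in for the SELECT INTO
    private volatile State state = State.NOT_BUILT;
    private CompletableFuture<Void> lastBuild = CompletableFuture.completedFuture(null);

    AsyncViewSketch(ExecutorService executor, Runnable buildTask)
    {
        this.executor = executor;
        this.buildTask = buildTask;
    }

    /** Submits a background build unless one is already running or finished. */
    synchronized CompletableFuture<Void> materializeAsync()
    {
        if (state == State.NOT_BUILT)
        {
            state = State.BUILDING;
            // The build runs on the executor, so it is not tied to the calling request.
            lastBuild = CompletableFuture.runAsync(buildTask, executor)
                    .whenComplete((ignored, error) -> {
                        state = (error == null) ? State.READY : State.NOT_BUILT;
                    });
        }
        return lastBuild;
    }

    /** Request path: never waits for the build; falls back to joins when the view is not ready. */
    String fromSql()
    {
        if (state == State.READY)
            return MATERIALIZED_SQL;
        materializeAsync();
        return JOIN_SQL;
    }

    public static void main(String[] args)
    {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        AsyncViewSketch view = new AsyncViewSketch(executor, () -> { /* pretend to build the view */ });
        System.out.println(view.fromSql());   // join fallback; a build has been submitted
        view.materializeAsync().join();       // demo only: wait for the background build
        System.out.println(view.fromSql());   // materialized path
        executor.shutdown();
    }
}
```

The request thread only reads the state and submits work, so cancelling the request cannot abort the build in this sketch.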

Related Pull Requests

Changes

  • ExpMaterialTableImpl: replaced the synchronous getMaterializedSQL() call with the async-aware path.

{
// view became stale after the snapshot was taken, trigger a rebuild and fall back
mqh.materializeAsync();
sql.append(getJoinSQL(selectedColumns));

Contributor

A lot of repeated logic here. Consider just initializing usedMaterialized to false and then always applying getJoinSQL(selectedColumns) when it is false after this check. Example:

@Override
public SQLFragment getFromSQLExpanded(String alias, Set<FieldKey> selectedColumns)
{
    SQLFragment sql = new SQLFragment("(");
    boolean usedMaterialized = false;

    // SELECT FROM
    /* NOTE: We want to avoid caching in paths where the table is actively being updated (e.g. loadRows).
     * Unfortunately, we don't _really_ know when this is, but if we are in a transaction that's a good guess.
     * Also, we may use RemapCache for material lookup outside a transaction.
     */
    boolean onlyMaterialColums = false;
    if (null != selectedColumns && !selectedColumns.isEmpty())
        onlyMaterialColums = selectedColumns.stream().allMatch(fk -> fk.getName().equalsIgnoreCase("Folder") || null != _rootTable.getColumn(fk));
    if (!onlyMaterialColums && null != _ss && null != _ss.getTinfo()
            && !getExpSchema().getDbSchema().getScope().isTransactionActive())
    {
        _MaterializedQueryHelper mqh = getOrCreateMQH();
        if (mqh != null)
        {
            // Snapshot the isReadyToUse() decision on first call; reuse on all subsequent calls
            // within the same TableInfo instance (i.e., the same query-construction scope).
            // This prevents a race where a background build completes between two getFromSQL
            // calls for the same lookup target, which would otherwise produce inconsistent SQL
            // fragments (one materialized, one not) for the same table alias.
            Boolean ready = _mqhReadySnapshot;
            if (ready == null)
            {
                ready = mqh.isReadyToUse();
                _mqhReadySnapshot = ready;
                if (!ready)
                    mqh.materializeAsync();
            }

            if (ready)
            {
                // The view may have been invalidated between the snapshot decision and now.
                // tryGetFromSqlIfLoaded re-checks LoadingState and needsSynchronousWork without
                // blocking; returns null if the view is no longer usable. In debug builds,
                // LookupColumn.declareJoins may call this path twice for the same alias — if the
                // view is invalidated between the two calls the second will return null and
                // debugCompareSQL will fail. This is accepted; see the _mqhReadySnapshot javadoc.
                SQLFragment tempRef = mqh.tryGetFromSqlIfLoaded("_cached_view_");
                if (tempRef != null)
                {
                    sql.append(new SQLFragment("SELECT * FROM ").append(tempRef));
                    usedMaterialized = true;
                }
                else
                {
                    // view became stale after the snapshot was taken, trigger a rebuild and fall back
                    mqh.materializeAsync();
                }
            }
        }
    }

    if (!usedMaterialized)
        sql.append(getJoinSQL(selectedColumns));

    // WHERE
    SQLFragment filterFrag = getFilter().getSQLFragment(_rootTable, null);
    sql.append("\n").append(filterFrag);
    if (_ss != null && !usedMaterialized)
    {
        if (!filterFrag.isEmpty())
            sql.append(" AND ");
        else
            sql.append(" WHERE ");
        sql.append("CpasType = ").appendValue(_ss.getLSID());
    }
    sql.append(") ").appendIdentifier(alias);

    return getTransformedFromSQL(sql);
}

* This is cheap: it creates the MQH configuration but does NOT trigger a SELECT INTO or any incremental SQL.
* Returns null if there is no sample type ({@code _ss} is null).
*/
private _MaterializedQueryHelper getOrCreateMQH()

Contributor

Annotate this method with @Nullable, since it can return null.



@RequiresPermission(ApplicationAdminPermission.class)
public static class ClearMaterializedViewsAction extends FormHandlerAction<QueriesForm>

Contributor

Can we just use the caches page for clearing this instead of needing a whole new action? It should already be listed in the caches and clearable.

Contributor Author

Good point. I didn't realize this can already be cleared from the caches page.

* Returns true if the global (non-transactional) materialized view is LOADED and has no pending synchronous work.
* Use this to decide whether to use the fast materialized path or fall back to direct JOINs.
*/
public boolean isReadyToUse()

Contributor

It seems redundant to have both isReadyToUse() and tryGetFromSqlIfLoaded(). Is there a caller of isReadyToUse() that doesn't immediately want to proceed to use the table?

SQLFragment sql = mqh.tryGetFromSqlIfLoaded("_cached_view_");
if (null == sql)
{
    mqh.materializeAsync();   // start async loading
    sql = getJoinSQL(selectedColumns);
}
...
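As a rough illustration of the single-entry-point shape suggested here, the sketch below folds the readiness check and the fetch into one call, so there is no window between "is it ready?" and "give me the SQL". Everything in it is hypothetical: `TryGetSketch`, the `Optional` return type, and the placeholder SQL are assumptions for the example, not the helper's actual API.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch: one call both checks readiness and returns the table reference. */
final class TryGetSketch
{
    // Name of the loaded view, or null when it is not loaded (or was invalidated).
    private final AtomicReference<String> loadedView = new AtomicReference<>();
    private final AtomicInteger rebuildRequests = new AtomicInteger();

    void markLoaded(String viewName) { loadedView.set(viewName); }
    void invalidate()                { loadedView.set(null); }
    int rebuildRequests()            { return rebuildRequests.get(); }

    /** Single read of the state: either a usable reference or empty, never blocking. */
    Optional<String> tryGetFromSqlIfLoaded(String alias)
    {
        String view = loadedView.get();
        return view == null ? Optional.empty() : Optional.of(view + " " + alias);
    }

    /** Stand-in for submitting a background rebuild; here it only counts requests. */
    void materializeAsync() { rebuildRequests.incrementAndGet(); }

    /** Caller side: use the view if present, otherwise request a rebuild and fall back. */
    String fromSql(String alias)
    {
        return tryGetFromSqlIfLoaded(alias)
                .map(ref -> "SELECT * FROM " + ref)
                .orElseGet(() -> {
                    materializeAsync();
                    return "SELECT * FROM /* direct JOINs */ base_tables";
                });
    }
}
```

With this shape a caller never holds a stale "ready" answer, though it does not by itself make two separate calls for the same alias agree with each other.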

@XingY XingY Jun 10, 2026

Contributor Author

When a lookup is defined on the samples table, getJoinCondition is called again even though the join was already determined, because of this check:

if (assertEnabled || !map.containsKey(colTableAlias))

Could we remove the assertEnabled check here? Without it, the lookup join SQL won't be generated a second time, and we can simplify the materialization logic for samples.
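To make the repeated-call concern concrete, here is a minimal sketch of why a per-instance snapshot keeps two SQL generations for the same alias consistent even if the view becomes ready in between. `ReadySnapshotSketch` and its placeholder SQL are hypothetical; this only mirrors the idea behind `_mqhReadySnapshot`, not its actual implementation.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of snapshotting a readiness decision per table instance. */
final class ReadySnapshotSketch
{
    private final BooleanSupplier liveReadiness;   // stands in for a live isReadyToUse() check
    private Boolean readySnapshot;                 // null until the first call decides

    ReadySnapshotSketch(BooleanSupplier liveReadiness)
    {
        this.liveReadiness = liveReadiness;
    }

    /** The first call records the decision; later calls reuse it, whatever the live state is. */
    synchronized String fromSql(String alias)
    {
        if (readySnapshot == null)
            readySnapshot = liveReadiness.getAsBoolean();
        String source = readySnapshot ? "_cached_view_" : "/* direct JOINs */ base_tables";
        return "(SELECT * FROM " + source + ") " + alias;
    }
}
```

If the second generation were skipped entirely, as proposed above, a snapshot like this would no longer be needed for this case.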
