Asynchronous materialization of samples view #7734
{
    // view became stale after the snapshot was taken, trigger a rebuild and fall back
    mqh.materializeAsync();
    sql.append(getJoinSQL(selectedColumns));
A lot of repeated logic here. Consider just initializing usedMaterialized to false and then always applying getJoinSQL(selectedColumns) when it is false after this check. Example:
@Override
public SQLFragment getFromSQLExpanded(String alias, Set<FieldKey> selectedColumns)
{
    SQLFragment sql = new SQLFragment("(");
    boolean usedMaterialized = false;
    // SELECT FROM
    /* NOTE We want to avoid caching in paths where the table is actively being updated (e.g. loadRows)
     * Unfortunately, we don't _really_ know when this is, but if we are in a transaction that's a good guess.
     * Also, we may use RemapCache for material lookup outside a transaction
     */
    boolean onlyMaterialColums = false;
    if (null != selectedColumns && !selectedColumns.isEmpty())
        onlyMaterialColums = selectedColumns.stream().allMatch(fk -> fk.getName().equalsIgnoreCase("Folder") || null != _rootTable.getColumn(fk));
    if (!onlyMaterialColums && null != _ss && null != _ss.getTinfo()
            && !getExpSchema().getDbSchema().getScope().isTransactionActive())
    {
        _MaterializedQueryHelper mqh = getOrCreateMQH();
        if (mqh != null)
        {
            // Snapshot the isReadyToUse() decision on first call; reuse on all subsequent calls
            // within the same TableInfo instance (i.e., the same query-construction scope).
            // This prevents a race where a background build completes between two getFromSQL
            // calls for the same lookup target, which would otherwise produce inconsistent SQL
            // fragments (one materialized, one not) for the same table alias.
            Boolean ready = _mqhReadySnapshot;
            if (ready == null)
            {
                ready = mqh.isReadyToUse();
                _mqhReadySnapshot = ready;
                if (!ready)
                    mqh.materializeAsync();
            }
            if (ready)
            {
                // The view may have been invalidated between the snapshot decision and now.
                // tryGetFromSqlIfLoaded re-checks LoadingState and needsSynchronousWork without
                // blocking; returns null if the view is no longer usable. In debug builds,
                // LookupColumn.declareJoins may call this path twice for the same alias. If the
                // view is invalidated between the two calls, the second will return null and
                // debugCompareSQL will fail. This is accepted; see the _mqhReadySnapshot javadoc.
                SQLFragment tempRef = mqh.tryGetFromSqlIfLoaded("_cached_view_");
                if (tempRef != null)
                {
                    sql.append(new SQLFragment("SELECT * FROM ").append(tempRef));
                    usedMaterialized = true;
                }
                else
                {
                    // view became stale after the snapshot was taken, trigger a rebuild and fall back
                    mqh.materializeAsync();
                }
            }
        }
    }
    if (!usedMaterialized)
        sql.append(getJoinSQL(selectedColumns));
    // WHERE
    SQLFragment filterFrag = getFilter().getSQLFragment(_rootTable, null);
    sql.append("\n").append(filterFrag);
    if (_ss != null && !usedMaterialized)
    {
        if (!filterFrag.isEmpty())
            sql.append(" AND ");
        else
            sql.append(" WHERE ");
        sql.append("CpasType = ").appendValue(_ss.getLSID());
    }
    sql.append(") ").appendIdentifier(alias);
    return getTransformedFromSQL(sql);
}
 * This is cheap: it creates the MQH configuration but does NOT trigger a SELECT INTO or any incremental SQL.
 * Returns null if there is no sample type ({@code _ss} is null).
 */
private _MaterializedQueryHelper getOrCreateMQH()
@RequiresPermission(ApplicationAdminPermission.class)
public static class ClearMaterializedViewsAction extends FormHandlerAction<QueriesForm>
Can we just use the caches page for clearing this instead of needing a whole new action? It should already be listed in the caches and clearable.
Good point. I didn't realize this can already be cleared from the caches page.
 * Returns true if the global (non-transactional) materialized view is LOADED and has no pending synchronous work.
 * Use this to decide whether to use the fast materialized path or fall back to direct JOINs.
 */
public boolean isReadyToUse()
It seems redundant to have both isReadyToUse() and tryGetFromSqlIfLoaded(). Is there a caller of isReadyToUse() that doesn't immediately want to proceed to use the table?
sql = tryGetFromSqlIfLoaded();
if (null == sql)
{
    // start async loading, then fall back to the direct JOINs
    mqh.materializeAsync();
    sql = getJoinSQL(selectedColumns);
}
...
There is one such case: when a lookup is defined on the samples table, getJoinCondition is called again even though the join condition was already determined:
Could we remove the "assert" check here? Without it, the lookup join SQL won't be generated a second time, and we can simplify the materialization logic for samples.
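For context, the code comments in this PR say the readiness decision is snapshotted so that two calls for the same alias produce the same SQL. The following is only a minimal sketch of that idea, not the PR's code: `ReadySnapshotSketch`, `useMaterialized`, and the table names in the SQL strings are hypothetical, and the `BooleanSupplier` merely stands in for `isReadyToUse()`.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch: remembers the first readiness answer for one query-construction scope.
class ReadySnapshotSketch
{
    private final BooleanSupplier isReadyToUse; // stand-in for the helper's readiness check
    private Boolean snapshot;                   // null until the first call

    ReadySnapshotSketch(BooleanSupplier isReadyToUse)
    {
        this.isReadyToUse = isReadyToUse;
    }

    // Asks the supplier once, then reuses the answer for every later call on this instance.
    boolean useMaterialized()
    {
        if (snapshot == null)
            snapshot = isReadyToUse.getAsBoolean();
        return snapshot;
    }

    // Illustrative SQL only; both table names are placeholders.
    String fromSql(String alias)
    {
        return useMaterialized()
                ? "(SELECT * FROM _cached_view_) " + alias
                : "(SELECT * FROM joined_tables) " + alias;
    }
}
```

With this shape, a background build that finishes between two calls does not change the second call's output; only a new instance (a new query-construction scope) picks up the materialized path. If the duplicate call were removed as suggested, the snapshot would no longer be needed.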
Rationale
On server start or after a sample designer update, the first request to load the sample grid for a large sample type triggers a synchronous full-table materialization, which can take a long time when the sample type has many rows. If the user navigates away before it completes, the HTTP request is cancelled mid-flight, aborting the SELECT INTO. The materialized view is never populated, so subsequent visits repeat the same blocking behavior indefinitely.
This change decouples view population from the request. When the materialized view is not ready (never built, or stale due to pending incremental updates), the request falls back immediately to direct JOINs and submits a background rebuild. The background task runs to completion regardless of user navigation, so the next visit uses the fast materialized path.
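The fallback-plus-background-rebuild pattern described above can be sketched roughly as follows. This is an illustrative stand-in, not the PR's implementation: `AsyncViewSketch`, its fields, and the `Supplier<String>` that plays the role of the slow SELECT INTO are all assumptions made for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical stand-in for a materialized-view helper; not LabKey API.
class AsyncViewSketch
{
    private final AtomicReference<String> loadedTable = new AtomicReference<>(); // null until built
    private final ExecutorService executor;   // runs builds independently of any request thread
    private final Supplier<String> builder;   // stand-in for the slow SELECT INTO
    private CompletableFuture<Void> pending;  // guarded by "this"

    AsyncViewSketch(ExecutorService executor, Supplier<String> builder)
    {
        this.executor = executor;
        this.builder = builder;
    }

    // Submits a background build unless one is already in flight; never blocks the caller.
    synchronized CompletableFuture<Void> materializeAsync()
    {
        if (pending == null || pending.isDone())
            pending = CompletableFuture.runAsync(() -> loadedTable.set(builder.get()), executor);
        return pending;
    }

    // Uses the materialized table when it is loaded; otherwise falls back to the
    // caller's JOIN SQL and requests a background build for later requests.
    String fromSql(String joinSql)
    {
        String table = loadedTable.get();
        if (table != null)
            return "SELECT * FROM " + table;
        materializeAsync();
        return joinSql;
    }
}
```

Because the build runs on the executor rather than on the request thread, abandoning the request that triggered it does not abort the build in this sketch. A first call to `fromSql` returns the JOIN SQL immediately, and a call made after the build completes returns the materialized form.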
Related Pull Requests
Changes