Data Model

How AVM is structured internally

AVM separates observed inventory, canonical references, vulnerability intelligence, operational results, and review records into distinct but connected entities. This separation is part of what keeps the system inspectable.

The data model explains the system

AVM stores inventory, reference data, and matching results in different layers. Raw software observations are preserved as software records tied to assets. Canonical vendor and product references provide normalized targets. Vulnerability records and criteria define what may be affected. Alerts represent the operational result of evaluating software against those conditions.

Import staging, unresolved mappings, aliases, settings, and audit records are also modeled explicitly rather than hidden in background processing. This makes it easier to understand what the system knows, what it inferred, and what still needs review.

Design principle: the structure of the data should help explain the behavior of the system.

High-level model

At a high level, AVM can be understood as five connected layers:

Assets
Software installs
Canonical references
Vulnerability criteria
Alerts and review records

Import staging and administration support this model rather than sitting outside it.

Core operational entities

Asset

Represents a monitored system such as a server, workstation, laptop, or VM. Assets provide the host context for software and alerts.

SoftwareInstall

Represents observed software on an asset. This is the main operational starting point for canonical linking and vulnerability evaluation.

Vulnerability

Represents imported vulnerability intelligence and related metadata used by downstream matching and prioritization.

Alert

Represents the operational result of evaluating installed software against stored vulnerability conditions.

Assets

The asset is the host-level anchor of the AVM model. It represents the system on which software is observed and where alerts become operationally meaningful.

Assets typically hold identity and context: names, external identifiers, platform and OS information, hardware-related details, and observation timestamps.

Why assets matter

Matching is useful only when results can be tied back to a real system.

Typical role

An asset provides the host context for many software rows and many resulting alerts.

Software installs

SoftwareInstall records represent observed software attached to assets. They preserve the inventory-side view of what was found before the system decides how to map or evaluate it.

This entity is intentionally rich because AVM needs to preserve source context, raw naming, version information, import linkage, and canonical linkage state.

Raw inventory values

Raw vendor, product, publisher, and version values help preserve source evidence and support later review.

Canonical linkage fields

Software rows can reference canonical vendor and product records once they have been linked.

Source and provenance

Import source, source type, package-related information, and timestamps preserve how a row entered the system.

Operational control

Some software rows may be intentionally excluded from canonical linking workflows, which affects downstream matching behavior.
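
As a rough illustration of the four field groups described above, a software row attached to an asset might be sketched as follows. This is not AVM's actual schema: the field names (raw_vendor, import_source, exclude_from_linking, and so on) and the is_linked helper are hypothetical, chosen only to show how raw values, provenance, canonical linkage, and operational control can sit side by side on one record.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Asset:
    # Hypothetical fields: identity and context for a monitored system.
    name: str
    external_id: Optional[str] = None
    os_name: Optional[str] = None
    last_seen: Optional[datetime] = None


@dataclass
class SoftwareInstall:
    asset: Asset
    # Raw inventory values, kept exactly as observed.
    raw_vendor: Optional[str]
    raw_product: str
    raw_version: Optional[str]
    # Source and provenance (assumed field name).
    import_source: Optional[str] = None
    # Canonical linkage, empty until linking succeeds.
    canonical_vendor: Optional[str] = None
    canonical_product: Optional[str] = None
    # Operational control: skip this row in linking workflows.
    exclude_from_linking: bool = False

    def is_linked(self) -> bool:
        """True once both canonical references are set."""
        return self.canonical_vendor is not None and self.canonical_product is not None


host = Asset(name="ws-001", os_name="Windows 10")
row = SoftwareInstall(host, "Microsoft Corporation", "Microsoft Windows Notepad", "10.0.19045")
```

In this sketch the raw values are never overwritten when linking happens; the canonical fields are filled in alongside them, so the original evidence stays available for review.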

Reference and canonical entities

AVM uses canonical references to avoid relying only on whatever strings appeared in an import. Canonical vendor and product tables provide stable identifiers that connect inventory to vulnerability logic.

CpeVendor

Normalized vendor reference used across many software records, aliases, and vulnerability relationships.

CpeProduct

Normalized product reference under a canonical vendor. This becomes the main bridge into vulnerability matching.

CpeVendorAlias

Explicit mapping from alternative vendor names to canonical vendor identities.

CpeProductAlias

Explicit mapping from alternative product names to canonical products under canonical vendors.

AVM also includes synonym-driven normalization logic to help resolve common naming variation before or during candidate selection and review.
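
A minimal sketch of alias-based resolution, assuming aliases are looked up after a simple case and whitespace normalization. The alias tables, the normalize function, and the resolve function are illustrative assumptions, not AVM's actual logic; AVM's synonym handling is likely more involved.

```python
from typing import Dict, Optional, Tuple


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization step)."""
    return " ".join(name.lower().split())


# Hypothetical alias data: alternative vendor name -> canonical vendor.
VENDOR_ALIASES: Dict[str, str] = {
    "microsoft corporation": "microsoft",
    "microsoft corp.": "microsoft",
}

# Hypothetical alias data: (canonical vendor, alternative product name) -> canonical product.
PRODUCT_ALIASES: Dict[Tuple[str, str], str] = {
    ("microsoft", "microsoft windows notepad"): "windows_notepad",
}


def resolve(raw_vendor: str, raw_product: str) -> Optional[Tuple[str, str]]:
    """Return (canonical vendor, canonical product), or None if either lookup fails."""
    vendor = VENDOR_ALIASES.get(normalize(raw_vendor))
    if vendor is None:
        return None
    product = PRODUCT_ALIASES.get((vendor, normalize(raw_product)))
    if product is None:
        return None
    return vendor, product
```

Product aliases are keyed by canonical vendor here to mirror the description of CpeProductAlias as a mapping "under canonical vendors". A None result corresponds to a row that would stay unresolved and need review.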

Vulnerability and criteria entities

Vulnerability matching in AVM is not driven only by a flat list of affected products. The data model also stores structured criteria that can be evaluated as conditions.

Vulnerability

Stores the main vulnerability record and supporting metadata used for evaluation and prioritization.

VulnerabilityAffectedCpe

Stores the affected canonical vendor and product pairs used for direct affected-CPE matching and lookups.

VulnerabilityCriteriaNode

Stores the node structure of a criteria tree so logical conditions can be evaluated rather than flattened away.

VulnerabilityCriteriaCpe

Stores CPE predicates attached to criteria nodes, including the information needed for version-aware evaluation.

This separation matters because a vulnerability may be described by combinations of product, platform, and version conditions rather than by a single simple product string.
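
To make the idea of an evaluable criteria tree concrete, here is a small sketch, assuming each node carries an AND or OR operator over its CPE predicates and child nodes. The class and field names are hypothetical stand-ins for VulnerabilityCriteriaNode and VulnerabilityCriteriaCpe, and version checks are left out to keep the tree logic visible.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class CriteriaCpe:
    # A CPE predicate reduced to a canonical vendor/product pair (assumption).
    vendor: str
    product: str


@dataclass
class CriteriaNode:
    operator: str  # "AND" or "OR"
    cpes: List[CriteriaCpe] = field(default_factory=list)
    children: List["CriteriaNode"] = field(default_factory=list)


def evaluate(node: CriteriaNode, installed: Set[Tuple[str, str]]) -> bool:
    """Evaluate a node against the canonical pairs present on one asset."""
    results = [(c.vendor, c.product) in installed for c in node.cpes]
    results += [evaluate(child, installed) for child in node.children]
    if not results:
        return False  # an empty node matches nothing
    return all(results) if node.operator == "AND" else any(results)


# "exampleapp running on exampleos": both branches must match.
tree = CriteriaNode(
    "AND",
    children=[
        CriteriaNode("OR", cpes=[CriteriaCpe("examplevendor", "exampleapp")]),
        CriteriaNode("OR", cpes=[CriteriaCpe("exampleos", "os")]),
    ],
)
```

A flat list of affected products could not express the "application on platform" condition in this example, which is the reason for keeping the node structure.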

Alerts and review entities

Alerts represent the operational outcome of matching, but AVM also preserves review-oriented entities for cases that are not fully resolved.

Alert

Connects a software record and a vulnerability result in a form that operators can review and act on.

UnresolvedMapping

Records software naming cases that have not yet been linked confidently to canonical vendor and product references.

This is an important distinction in AVM: not every unresolved inventory row becomes an alert, and review work is recorded as its own entity rather than hidden inside matching.

Alerts also carry a certainty value. CONFIRMED means AVM could verify that the vulnerability applies with sufficient precision. UNCONFIRMED means the vulnerability may apply, but the available evidence was not enough for full confirmation, typically because of version-related ambiguity.

When an alert is UNCONFIRMED, AVM can also retain an uncertainty reason, such as missing software version, unparseable version, or lack of a usable version constraint in the vulnerability data.
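
The certainty rules above can be sketched as a small classification function. This is an assumption-laden illustration: the reason codes, the order of the checks, the dotted-integer version parser, and the single exclusive upper bound are all hypothetical simplifications, not AVM's actual implementation.

```python
from enum import Enum
from typing import Optional, Tuple


class Certainty(Enum):
    CONFIRMED = "CONFIRMED"
    UNCONFIRMED = "UNCONFIRMED"


def parse_version(text: str) -> Optional[Tuple[int, ...]]:
    """Parse a dotted numeric version such as '10.0.19045'; None if not parseable."""
    try:
        return tuple(int(part) for part in text.strip().split("."))
    except ValueError:
        return None


def classify(
    software_version: Optional[str],
    upper_bound_exclusive: Optional[str],
) -> Optional[Tuple[Certainty, Optional[str]]]:
    """Return (certainty, reason), or None when the software is outside the range."""
    if not software_version:
        return Certainty.UNCONFIRMED, "MISSING_SOFTWARE_VERSION"
    installed = parse_version(software_version)
    if installed is None:
        return Certainty.UNCONFIRMED, "UNPARSEABLE_VERSION"
    bound = parse_version(upper_bound_exclusive) if upper_bound_exclusive else None
    if bound is None:
        return Certainty.UNCONFIRMED, "NO_USABLE_VERSION_CONSTRAINT"
    if installed < bound:
        return Certainty.CONFIRMED, None
    return None  # version is at or above the bound: no alert
```

For example, classify("10.0.19045", "10.0.20000") yields a CONFIRMED result with no reason, while classify(None, "10.0.20000") yields UNCONFIRMED with the missing-version reason. Real version schemes are messier than dotted integers, so a production comparison would need more than tuple ordering.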

Import and staging

AVM models staged import explicitly. Imported data is first validated and stored in staging entities before becoming part of the main operational inventory.

ImportRun

Tracks an import execution and provides a shared reference for staged data, outcomes, and later review.

ImportStagingAsset

Holds staged asset rows during validation and preview before final import.

ImportStagingSoftware

Holds staged software rows during validation and preview, including inventory-side values that may later need review or canonical resolution.

Why staging exists

It allows AVM to make import behavior visible and reviewable, instead of writing everything directly into the core tables.
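
A minimal sketch of the stage, preview, then commit flow, assuming validation only checks for required fields. The class names echo ImportRun and ImportStagingSoftware, but the fields, the validation rules, and the stage, preview, and commit methods are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StagedSoftwareRow:
    asset_name: str
    raw_product: str
    raw_version: str = ""
    errors: List[str] = field(default_factory=list)


@dataclass
class ImportRun:
    run_id: int
    staged: List[StagedSoftwareRow] = field(default_factory=list)

    def stage(self, record: Dict[str, str]) -> StagedSoftwareRow:
        """Validate one incoming record and keep it in staging, valid or not."""
        row = StagedSoftwareRow(
            asset_name=record.get("asset", "").strip(),
            raw_product=record.get("product", "").strip(),
            raw_version=record.get("version", "").strip(),
        )
        if not row.asset_name:
            row.errors.append("missing asset name")
        if not row.raw_product:
            row.errors.append("missing product name")
        self.staged.append(row)
        return row

    def preview(self) -> Dict[str, int]:
        """Summarize staged rows before anything reaches the core tables."""
        valid = sum(1 for r in self.staged if not r.errors)
        return {"valid": valid, "invalid": len(self.staged) - valid}

    def commit(self) -> List[StagedSoftwareRow]:
        """Return only the rows that would be written to the operational inventory."""
        return [r for r in self.staged if not r.errors]
```

Invalid rows stay in staging with their error messages instead of being silently dropped, which is the property the staging design is meant to provide.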

Security and administration

AVM also models operational control and traceability explicitly. Administration is not only UI behavior. It is part of the stored system state.

AppUser

Represents an authenticated application user and related account state.

AppRole

Represents application roles used for access control.

SystemSetting

Stores configurable system behavior so important matching and review-related choices remain visible.

SecurityAuditLog

Stores security-relevant administrative and account events for traceability.

How the pieces fit together

A typical operational path in AVM looks like this:

Asset
SoftwareInstall
Canonical vendor/product
Vulnerability criteria
Alert

If canonical linking is incomplete, the software row may instead remain visible through unresolved mapping and review workflows.

Example

Consider an asset that has the following software installed:

  • vendor (raw): Microsoft Corporation
  • product (raw): Microsoft Windows Notepad
  • version (raw): 10.0.19045

This raw record is stored as observed evidence. At this stage, AVM does not assume that these values already represent a stable product identity.

During canonical linking, this row may be linked to:

  • canonical vendor: microsoft
  • canonical product: windows_notepad

Separately, AVM stores vulnerability definitions such as:

  • CVE-XXXX-YYYY affects microsoft:windows_notepad
  • version condition: < 10.0.20000

When alerts are generated, AVM evaluates whether the canonical identity and version of the software satisfy these conditions.

In this example, version 10.0.19045 is below 10.0.20000 and therefore falls within the affected range, so an alert is created. If the version were 10.0.20000 or higher, or if the product had not been linked correctly, no alert (or an uncertain result) would be produced.

This illustrates the separation: raw observation, canonical identity, vulnerability conditions, and alert generation are connected, but each step is evaluated explicitly.
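
The walkthrough above can be condensed into a short sketch that keeps each step explicit. The lookup tables, the function names, and the string outcomes are hypothetical and mirror only this example; CVE-XXXX-YYYY is the placeholder identifier used in the text, not a real CVE.

```python
from typing import Dict, Tuple

# Step 2 input: hypothetical alias data for canonical linking.
ALIASES: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("microsoft corporation", "microsoft windows notepad"): ("microsoft", "windows_notepad"),
}

# Step 3 input: canonical pair -> (vulnerability id, exclusive upper version bound).
CRITERIA: Dict[Tuple[str, str], Tuple[str, Tuple[int, ...]]] = {
    ("microsoft", "windows_notepad"): ("CVE-XXXX-YYYY", (10, 0, 20000)),
}


def to_tuple(version: str) -> Tuple[int, ...]:
    """Parse a dotted numeric version; assumes well-formed input for brevity."""
    return tuple(int(part) for part in version.split("."))


def evaluate(raw_vendor: str, raw_product: str, raw_version: str) -> str:
    # Step 1: the raw observation is taken as evidence, not as identity.
    key = (raw_vendor.lower(), raw_product.lower())
    # Step 2: canonical linking.
    canonical = ALIASES.get(key)
    if canonical is None:
        return "UNRESOLVED_MAPPING"
    # Step 3: look up vulnerability conditions for the canonical identity.
    rule = CRITERIA.get(canonical)
    if rule is None:
        return "NO_ALERT"
    cve, upper_bound = rule
    # Step 4: alert only when the version satisfies the condition.
    if to_tuple(raw_version) < upper_bound:
        return f"ALERT {cve}"
    return "NO_ALERT"
```

With the values from the example, the row produces an alert; a version of 10.0.20000 produces none, and an unknown vendor name ends up as an unresolved mapping rather than a silent miss.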

Why this structure matters

It preserves evidence

Raw inventory values and staging records help explain how a row entered the system.

It preserves reference truth

Canonical tables and vulnerability criteria provide stable, reusable matching inputs.

It preserves review surfaces

Unresolved mappings, aliases, settings, and audit records make outstanding review work and operational changes explicit.

It preserves result traceability

Alerts exist as a distinct operational result rather than an implicit by-product of hidden logic.