Import
How data enters AVM
AVM uses a staged import model so inventory data can be validated, previewed, and preserved with its original context. Importing software is the start of the workflow, not the end of it.
Import is not the end of the process
AVM imports inventory through staged workflows. Asset and software data are first loaded into staging records and validated there before being committed to the main operational tables. This keeps import behavior visible and reviewable.
Software import preserves raw values such as vendor, publisher, product, and version information. Those raw values are important because canonical resolution and vulnerability matching happen after import, not instead of import.
Key point: successful import does not mean the software is already canonically resolved or fully ready for matching.
Import flow
Import staging exists to make inventory ingestion inspectable, not to hide it inside one opaque upload action.
Assets first, then software
AVM expects software rows to be tied to assets. In practice, this means asset records should exist before software is imported against them.
The asset acts as the host anchor for later software review and alert generation, so software import depends on asset identity being available.
Asset import
Creates the host-level records that software rows will attach to.
Software import
Adds observed software rows linked to known asset identity.
How assets and software are linked
In AVM import, software rows are not linked to assets by internal IDs. Instead, both asset and software JSON use the same external_key to establish the relationship. This means that the asset and its software must share the same stable external identifier during import.
Minimal asset example
```json
{
  "external_key": "example-uuid-0001",
  "name": "server-01"
}
```
Minimal software example
```json
{
  "external_key": "example-uuid-0001",
  "product": "Docker Desktop",
  "vendor": "Docker Inc.",
  "version": "4.60.1"
}
```
During import, AVM resolves this shared external_key to an internal asset_id. The software row is then stored as a record linked to that asset.

The external_key should remain stable across imports. If it changes, AVM will treat the asset as a different system.

Key idea: external_key is the import-side identity that ties asset and software together. AVM converts it into internal relationships after validation.
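The resolution step described above can be sketched as a small function. This is illustrative only: the function name, the `asset_id` key, and the returned shapes are assumptions, not AVM's actual API.

```python
def resolve_software_rows(assets, software_rows):
    """Attach each software row to the internal id of the asset
    that shares its external_key (illustrative sketch)."""
    # Map the stable external identifier to the internal asset id.
    by_external_key = {a["external_key"]: a["asset_id"] for a in assets}
    linked, unmatched = [], []
    for row in software_rows:
        asset_id = by_external_key.get(row["external_key"])
        if asset_id is None:
            unmatched.append(row)  # no asset was imported under this key
            continue
        linked.append({**row, "asset_id": asset_id})
    return linked, unmatched
```

A software row whose external_key matches no imported asset cannot be linked, which is why asset import comes first.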
Example osquery queries
The following examples show how similar data can be collected using osquery. In practice, these results are transformed into AVM JSON format before import.
Asset-side example
```shell
osqueryi --json "
SELECT
  uuid AS external_key,
  hostname AS name
FROM system_info;
"
```
The system_info table provides stable host identity such as uuid and hostname, which can be mapped to AVM asset fields.
Software-side example (Windows)
```shell
osqueryi --json "
SELECT
  (SELECT uuid FROM system_info LIMIT 1) AS external_key,
  name AS product,
  publisher AS vendor,
  version
FROM programs;
"
```
The programs table provides installed software inventory. The external_key is typically injected during transformation to link software rows to the corresponding asset.
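That injection step can be sketched as a small transform over osqueryi --json output. The function name and the exact AVM-side output shape are illustrative assumptions; the field names follow this page.

```python
import json

def programs_to_avm_software(external_key, osquery_json):
    """Turn osqueryi --json output for the programs table into
    AVM-shaped software rows (illustrative sketch)."""
    rows = json.loads(osquery_json)  # osqueryi --json emits a JSON array
    software = []
    for r in rows:
        software.append({
            "external_key": external_key,      # ties the row to its asset
            "product_raw": r.get("name"),
            "version_raw": r.get("version"),
            "publisher": r.get("publisher"),
            "vendor_raw": r.get("publisher"),  # publisher doubles as vendor evidence
            "source": "osquery",
        })
    return software
```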
Using osquery as a data source
AVM does not require a specific data source, but tools like osquery are commonly used to collect asset and software inventory.
osquery exposes system information as SQL tables. These tables can be queried and transformed into AVM-compatible JSON for import.
Asset-side data
Tables such as system_info and os_version provide host identity, OS details, hardware, and CPU information that map to AVM asset fields.
Software-side data
On Windows, the programs table provides installed software inventory, including name, version, publisher, and install location.
Key idea: osquery data is typically transformed into AVM JSON rather than imported directly. The transformation step maps osquery fields to AVM fields and assigns a stable external_key.
Learn more about osquery: Official documentation
What AVM can store for assets and software
Before thinking about import-source fields, it helps to understand what AVM is actually designed to preserve. The point of richer import is not to collect every possible field from the source, but to populate the fields AVM can store and use.
In other words, richer source data only matters when it maps to actual AVM fields.
Asset record in AVM
The assets table stores host identity and context: ownership, platform, OS, hardware, CPU, and system metadata.
- external_key
- name
- asset_type
- owner
- note
- source
- platform
- os_name
- os_version
- os_build
- os_major
- os_minor
- os_patch
- arch
- system_uuid
- serial_number
- hardware_vendor
- hardware_model
- hardware_version
- board_vendor
- board_model
- board_version
- board_serial
- cpu_brand
- cpu_physical_cores
- cpu_logical_cores
- cpu_sockets
- physical_memory
- computer_name
- hostname
- local_hostname
- last_seen_at
Software record in AVM
The software_installs table stores observed software with both raw evidence and canonical linkage fields.
- asset_id
- type
- source
- source_type
- vendor_raw
- product_raw
- version_raw
- publisher
- vendor
- product
- version
- version_norm
- normalized_vendor
- normalized_product
- cpe_name
- cpe_vendor_id
- cpe_product_id
- install_location
- installed_at
- package_identifier
- install_source
- package_manager
- bundle_id
- edition
- channel
- release_label
- purl
- last_seen_at
- import_run_id
- canonical_link_disabled
Key idea: import should populate fields that AVM can actually preserve and use later for linking, review, and matching.
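As a sketch of that idea, an import pipeline might filter source payloads against the asset field list above before staging. The function name is an assumption (AVM performs its own validation); the field set mirrors this page.

```python
# Fields the assets table can preserve, as listed on this page.
ASSET_FIELDS = {
    "external_key", "name", "asset_type", "owner", "note", "source",
    "platform", "os_name", "os_version", "os_build", "os_major",
    "os_minor", "os_patch", "arch", "system_uuid", "serial_number",
    "hardware_vendor", "hardware_model", "hardware_version",
    "board_vendor", "board_model", "board_version", "board_serial",
    "cpu_brand", "cpu_physical_cores", "cpu_logical_cores",
    "cpu_sockets", "physical_memory", "computer_name", "hostname",
    "local_hostname", "last_seen_at",
}

def split_asset_payload(payload):
    """Separate storable asset fields from keys AVM has no column for,
    so nothing is silently dropped (illustrative sketch)."""
    kept = {k: v for k, v in payload.items() if k in ASSET_FIELDS}
    ignored = sorted(set(payload) - ASSET_FIELDS)
    return kept, ignored
```

Reporting the ignored keys, rather than discarding them quietly, keeps the import reviewable.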
Richer asset example
AVM assets can preserve significantly more host context than the
minimal external_key plus name shape.
This becomes useful when your source already knows operating
system, hardware, CPU, and host identity details and you want that
context to remain available inside AVM.
```json
{
  "arch": "64-bit",
  "asset_type": "endpoint",
  "computer_name": "example-host-01",
  "cpu_brand": "Intel(R) Core(TM) m3-6Y30 CPU @ 0.90GHz",
  "cpu_logical_cores": "2",
  "cpu_physical_cores": "2",
  "cpu_sockets": "1",
  "external_key": "example-uuid-0001",
  "hardware_model": "VirtualBox",
  "hardware_vendor": "innotek GmbH",
  "hardware_version": "-1",
  "hostname": "example-host-01",
  "local_hostname": "example-host-01",
  "name": "example-host-01",
  "os_build": "26200",
  "os_major": "10",
  "os_minor": "0",
  "os_name": "Microsoft Windows 11 Pro",
  "os_version": "10.0.26200",
  "owner": "example-team",
  "physical_memory": "-1",
  "platform": "windows",
  "serial_number": "example-serial-1234",
  "source": "OSQUERY",
  "system_uuid": "example-uuid-0001"
}
```
This maps naturally to AVM asset-side fields in the assets table: host identity (external_key, name, hostname, computer_name, local_hostname, system_uuid, serial_number), platform and OS (platform, os_name, os_version, os_build, os_major, os_minor, os_patch, arch), hardware (hardware_vendor, hardware_model, hardware_version, board_vendor, board_model, board_version, board_serial), and CPU or memory context (cpu_brand, cpu_physical_cores, cpu_logical_cores, cpu_sockets, physical_memory).
Richer software example
AVM software rows can also preserve much richer observed evidence than a minimal vendor / product / version payload.
```json
{
  "external_key": "example-uuid-0001",
  "install_location": "C:\\Program Files\\Docker\\Docker",
  "last_seen_at": "2026-03-19 00:05:43",
  "product_raw": "Docker Desktop",
  "publisher": "Docker Inc.",
  "source": "osquery",
  "source_type": "osquery",
  "type": "application",
  "vendor_raw": "Docker Inc.",
  "version_raw": "4.60.1"
}
```
This maps to AVM software-side fields in the software_installs table. AVM can preserve raw evidence such as vendor_raw, product_raw, version_raw, and publisher; provenance such as source, source_type, and import_run_id; installation context such as install_location, installed_at, package_identifier, install_source, and package_manager; and product-shape metadata such as type, arch, edition, channel, release_label, bundle_id, and purl.
AVM also has normalized and canonical fields such as vendor, product, version, version_norm, normalized_vendor, normalized_product, cpe_name, cpe_vendor_id, and cpe_product_id, but those should be understood as later operational fields rather than something a source system must always provide directly.
How osquery data maps into AVM records
osquery is usually not exported into AVM one-to-one. Instead, a transformation step reads osquery tables and builds AVM-shaped asset and software JSON that matches the fields AVM can actually store.
Asset-side mapping
Host-oriented osquery tables can be transformed into one AVM asset record per system. Typical target fields are external_key, name, hostname, computer_name, system_uuid, platform, os_name, os_version, os_build, arch, hardware_vendor, hardware_model, cpu_brand, and related host metadata.
Software-side mapping
Installed-software osquery tables can be transformed into AVM software rows tied to that asset through the same stable external_key. Typical target fields are type, source, source_type, vendor_raw, product_raw, version_raw, publisher, install_location, installed_at, install_source, and package_identifier.
Why this matters
The goal is not just to ingest source data. The goal is to populate AVM's own operational model in a way that preserves evidence and supports later review, linking, and matching.
Typical osquery sources
When AVM import JSON is built from osquery, the final payload is usually assembled from multiple osquery tables rather than copied from a single query result as-is.
A common pattern is to produce one asset object per host from host-side tables, and many software objects for that same host from installed-software tables. Both sides are then tied together through the same external_key.
Asset-side tables
Tables such as system_info, os_version, and cpu_info are useful for building AVM asset records. In practice, these tables can provide host identity, platform, OS version, hardware, CPU, and related host metadata that map naturally into the assets table.
Software-side tables
On Windows, programs is typically the most direct source for installed software inventory.
Fields such as software name, version, publisher, install location, install source, install date, and identifying number can then be mapped into AVM software-side fields.
Transformation step
The final AVM import files are usually transformed outputs, not raw osquery tables. That transform step is where source-side field names are mapped into AVM's asset and software model.
Example field mapping from osquery to AVM
Asset-side examples
| osquery field | AVM field |
|---|---|
| system_info.hostname | hostname or name |
| system_info.uuid | system_uuid or stable external_key |
| system_info.cpu_brand | cpu_brand |
| system_info.cpu_physical_cores | cpu_physical_cores |
| system_info.cpu_logical_cores | cpu_logical_cores |
| system_info.cpu_sockets | cpu_sockets |
| system_info.physical_memory | physical_memory |
| system_info.hardware_vendor | hardware_vendor |
| system_info.hardware_model | hardware_model |
| os_version.name | os_name |
| os_version.version | os_version |
| os_version.major | os_major |
| os_version.minor | os_minor |
| os_version.patch | os_patch |
| os_version.build | os_build |
| os_version.platform | platform |
| os_version.arch | arch |
Software-side examples
| osquery field | AVM field |
|---|---|
| programs.name | product_raw |
| programs.version | version_raw |
| programs.publisher | publisher; optionally vendor_raw |
| programs.install_location | install_location |
| programs.install_source | install_source |
| programs.install_date | installed_at |
| programs.identifying_number | package_identifier |
Exact transformation rules are up to your import pipeline. AVM does not require osquery-specific field names, but it works best when stable asset identity and raw software evidence are preserved.
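As one possible pipeline shape (not AVM's own code), the software-side table above can be expressed as a plain mapping dict that a transform step applies to each programs row:

```python
# osquery programs-table column -> AVM software field, per the table above.
SOFTWARE_MAP = {
    "name": "product_raw",
    "version": "version_raw",
    "publisher": "publisher",
    "install_location": "install_location",
    "install_source": "install_source",
    "install_date": "installed_at",
    "identifying_number": "package_identifier",
}

def map_program_row(row):
    """Rename a programs-table row into AVM software field names,
    skipping columns the source did not provide (illustrative sketch)."""
    return {avm: row[osq] for osq, avm in SOFTWARE_MAP.items() if osq in row}
```

The asset-side table can be handled the same way with its own mapping dict.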
Accepted field styles
AVM accepts common JSON naming patterns used by inventory and integration tooling. This reduces the need for users to reshape data aggressively before import.
Common style
Snake case such as external_key, product_raw, and version_raw.
Also supported
Camel case forms used by some integrations, such as externalKey, which map to the same operational fields as their snake case counterparts.
The goal is not to encourage inconsistent naming. It is to make import practical for real inventory sources.
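A transform step that accepts both styles might fold camel case keys into snake case before validation. A minimal sketch, assuming a simple camelCase-to-snake_case convention:

```python
import re

def to_snake(key):
    """externalKey -> external_key; already-snake keys pass through."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", key).lower()

def normalize_keys(payload):
    """Fold all top-level keys of an import payload to snake case."""
    return {to_snake(k): v for k, v in payload.items()}
```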
Why raw values are preserved
AVM stores imported software as observed inventory, not as if it had already been normalized perfectly. Raw vendor, publisher, product, and version values may still be needed for review, canonical linking, alias creation, and auditability.
This is important because inventory data is rarely perfectly clean. Preserving raw evidence makes it possible to improve resolution later without losing the original source-side view.
Design principle: import should preserve evidence, not erase it.
Import staging
AVM uses explicit staging entities for asset and software import. This means uploads can be validated and previewed before they affect the main inventory records.
ImportRun
Tracks a specific import execution and ties together staged rows, status, and later review context.
ImportStagingAsset
Holds staged asset rows before import commit.
ImportStagingSoftware
Holds staged software rows before import commit.
Why this matters
Operators can inspect what is valid, what is invalid, and what will be imported before the final step.
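Illustratively, the staging entities named above can be modeled like this. This is a sketch of the shape only; AVM's actual schema will differ in detail.

```python
from dataclasses import dataclass, field

@dataclass
class ImportStagingAsset:
    external_key: str
    payload: dict
    valid: bool = False  # set during validation, before commit

@dataclass
class ImportStagingSoftware:
    external_key: str
    payload: dict
    valid: bool = False

@dataclass
class ImportRun:
    run_id: int
    status: str = "staged"  # e.g. staged -> validated -> committed
    staged_assets: list = field(default_factory=list)
    staged_software: list = field(default_factory=list)
```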
Import does not guarantee canonical resolution
A software row can be imported successfully and still remain unresolved from a canonical perspective. This is expected.
Import answers the question “what was observed and accepted into the system?” Canonical resolution answers the different question “what reference identity does this software correspond to?”
Import success means
The row passed import validation and became part of the operational inventory.
Import success does not mean
The row is already fully normalized, canonically linked, or ready for perfect vulnerability matching.
Choosing Replace or Append during software import
AVM allows two import modes depending on how you want to manage software inventory over time.
Append rows
Use this when you want to add newly observed software without removing existing data.
Typical use:
- A new application was installed on an asset
- You are collecting data from multiple sources
- You want to accumulate observations over time
Behavior:
- Existing software records remain unchanged
- New rows are added
- Alerts for existing software are not affected
Replace asset software
Use this when the import represents the current full state of each asset.
Typical use:
- Scheduled inventory updates (e.g. daily scan)
- Synchronizing with a source of truth (CMDB, full scan)
Behavior:
- Existing software is replaced by the imported set
- Software not present in the new import is removed
- Alerts linked to removed software are automatically closed on the next recalculation
Important: alert updates are not performed during import. After importing software (especially when using Replace), you should run Generate Alerts to synchronize alert state with the current software inventory.
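The two modes can be sketched as operations over software rows. The dedup key below (product_raw plus version_raw) is an assumption for illustration; real keys are up to the pipeline.

```python
def _key(row):
    return (row["product_raw"], row.get("version_raw"))

def append_rows(existing, imported):
    """Append mode: keep everything existing, add rows not already present."""
    seen = {_key(r) for r in existing}
    return existing + [r for r in imported if _key(r) not in seen]

def replace_rows(existing, imported):
    """Replace mode: the import is the full current state. Rows that
    disappeared are what alert recalculation later closes alerts for."""
    imported_keys = {_key(r) for r in imported}
    removed = [r for r in existing if _key(r) not in imported_keys]
    return list(imported), removed
```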
Recommended operational flow
1. Import software
   - Append → incremental update
   - Replace → full state refresh
2. (Optional) Review linking / unresolved mappings
3. Run: Generate Alerts
Key idea: Append grows the dataset (observation-oriented), while Replace reflects the current state (state-oriented). Generate Alerts synchronizes security results with that state.
What happens after import
After software is imported, AVM continues with canonical resolution and review workflows. Depending on the software row, that may include dictionary resolution, alias or synonym help, unresolved mapping review, canonical backfill, and later alert recalculation.
When unresolved entries are expected
Unresolved rows are normal when software naming is noisy, package metadata is incomplete, vendor names differ from the canonical dictionary, or product strings are too broad to map safely on first pass.
AVM keeps those rows visible because they are part of the real import result, not an exceptional corner case to hide.
Why version information helps
Product identity is not the only thing that matters after import. Version information can strongly affect whether a vulnerability applies. Providing version values at import time makes later matching more useful and reduces unnecessary ambiguity.
With version data
AVM can perform stronger version-aware evaluation in later matching steps.
Without version data
Canonical identity may still be useful, but downstream applicability decisions may be less precise.
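A hedged sketch of the difference, assuming a simple dotted version scheme and a hypothetical "fixed in" condition; AVM's actual version-aware matching is more involved.

```python
def _parse(version):
    """Split a dotted version like 4.60.1 into comparable integers."""
    return tuple(int(p) for p in version.split("."))

def vulnerable_before_fix(version_raw, fixed_in):
    """True/False when a version is known; None when applicability
    cannot be decided because no version was imported."""
    if not version_raw:
        return None  # product identity alone leaves the question open
    return _parse(version_raw) < _parse(fixed_in)
```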
Import sources and provenance
AVM keeps track of import-side context such as source and source type. This helps preserve how a row entered the system and supports later review, troubleshooting, and data-quality work.
In practice, import data may come from manual preparation, inventory scripts, platform exports, or tooling such as osquery. AVM benefits from that context instead of discarding it.
Example
A software upload may contain a product name that is valid enough to import but not specific enough to resolve to a canonical product immediately. AVM should still preserve the row, attach it to the correct asset, and make it available for unresolved review.
Later, after alias or synonym improvements and canonical backfill, that same software row may become matchable against vulnerability conditions without needing to re-import the raw evidence.
What AVM is trying to avoid
One-step opaque ingestion
Import should remain visible and reviewable, not disappear behind a single all-or-nothing action.
Discarding raw source evidence
Source-side values may still be important after import.
Equating import with normalization
Getting data into the system is different from resolving its canonical identity.
Hiding incomplete coverage
Unresolved rows should remain visible so the system can be improved iteratively.
Summary
Import in AVM is a staged process that preserves observed inventory, validates it visibly, and commits it into the system without pretending that canonical identity and vulnerability applicability are already solved.
That makes import a reliable operational starting point. Assets and software enter the platform with their original context intact, and the system can then improve coverage through canonical linking, unresolved review, backfill, and matching.