Collections available in Business Intelligence Database

Use Pentaho Data Catalog

Version
10.1.x
Audience
anonymous
Part Number
MK-95PDC000-02
BIDB hosts several collections, updated regularly by the bi-views service. This service systematically extracts data from all connected databases, organizing it into the following collections available within BIDB.

Checksum Aggregated View

The Checksum Aggregated View collection contains a summary of duplicate files for a specific entity, including their count and total size. The following table shows the details of the data available in this collection.

Field Description Data Type Example Value
_id Checksum derived from bi.entities_master_view. String “968dl402bd0ce783a573al4172c37690”
duplicateFilesCount The total number of duplicate files identified. Integer 3
duplicateFilesSize The total size of duplicate files. Integer 381

Custom Properties View

The Custom Properties View collection contains the details of custom properties in an entity, including their values. The following table shows the details of the data available in this collection.

Field Description Data Type Example Value
_id A unique identifier for the custom property entry. String “65dfc901d04619a9e6a8d62d”
EntityId A unique identifier for the entity. String "11"
PropertyId A unique identifier for the property. String "5"
PropertyName The name of the custom property. String "Name"
Value The assigned value of the custom property. String "John"

Entities Aggregated View

The Entities Aggregated View collection includes the details of the key attributes and values of aggregated entities. The following table shows the details of the data available in this collection.

Field Description Data Type Example Value
_id A unique identifier for the entity aggregation. String “65dfc901d04619a9e6a8d62d”
attribute The attribute name of the entity. String "DataSources"
type The data type of the attribute. String "Structured"
value The value associated with the attribute. String "34"

Entities Extension Count View

The Entities Extension Count view collection includes extension details, such as the file count, data source, and date of recording for each extension. The following table shows the details of the data available in this collection.

Field Description Data Type Example Value
_id A unique identifier for the count entry. String “65c56b02250cc54a7b43943f”
DataSourceFqdnId A fully qualified domain name identifier is required for the data source. String "5"
Date The date when the file count was recorded. Date 2024-02-09T00:00:02.110+00:00
Extension The file extension. String "text/plain; charset=IS0-8859-l;delimiter=comma”
FileCount The number of files with the specified extension. Integer 1

Entities Master View

The Entities Master View contains the essential structure and data field details of an entity. The following table shows the details of the data available in this collection.

       
_id A unique identifier for the entity. String "11"
Name The name of the entity. String "customers"
Type The type of the entity (for example, file, table). String "Table"
Parent The parent entity identifier. String "12"
DataSourceId A unique identifier for the data source. String "dataSource_01"
DataSourceName The name of the data source. String "SalesDB"
DataSourceType The type of the data source (for example, SQL, NoSQL). String "SQL"
ResourceType The type of the resource. String "Database"
DataProfileStatus The status of data profiling (for example, Complete, In Progress). String "Complete"
DataProfiled Whether the data has been profiled (True or False). Boolean True
LastUpdate The timestamp of the last update. Timestamp "2023-12-14T15:05:00Z"
ProductName The name of the product. String "MySQL"
ProductVersion A version of the product. String "8.0"
DriverName The name of the driver used. String "MySQL ODBC 8.0 Driver"
Url The URL associated with the entity. String "jdbc:mysql://example.com/db"
ParentName The name of the parent entity. String "SalesRegion"
TotalTables The total number of tables. Integer 12
TotalColumns The total number of columns. Integer 120
SchemaName The name of the schema. String "public"
DatabaseName The name of the database. String "SalesDB"
LastUpdateStatistics The timestamp of the last statistics update. Timestamp 2023-12-14T14:00:00Z
RowCount Number of rows in the entity. Integer 10000
NullCount Number of nulls in the entity. Integer 50
Cardinality The cardinality of the entity. Integer 9500
Hll HyperLogLog of the entity. String "hll:6a9..."
BlankCount The number of blank entries in the entity. Integer 20
Min The minimum value in the entity. String "1"
Max The maximum value in the entity. String "10000"
AvgValue The average value of the entity. Float 5000.5
MinWidth The minimum width of the entity. Integer 1
MaxWidth The maximum width of the entity. Integer 10
AvgWidth The average width of the entity. Float 5.5
ColumnsCount The count of columns in the entity. Integer 10
Path The path of the entity. String "/data/salesdb/customers"
CheckClause Check the clause of the entity. String "age > 18"
TableName The name of the table. String "customers"
DataType The data type of the entity. String "VARCHAR"
TypeName The name of the entity type. String "varchar"
ColumnSize The size of the column. Integer 255
BufferLength The length of the buffer. Integer 256
DecimalDigits A number of decimal digits. Integer 2
NumPrecRadix A numeric precision radix. Integer 10
IsNullable Whether the entity is nullable (True or False). Boolean True
OrdinalPosition Ordinal position of the entity. Integer 1
IsPrimaryKey Whether the entity is a primary key (True or False). Boolean False
IsForeignKey Whether the entity is a foreign key (True or False). Boolean False
ParentPath The parent path of the entity. String "/data/salesdb"
PathType The path type of the entity. String "Directory"
FileExtension The file extension of the entity. String ".txt"
Size The size of the entity. Integer 2048
Flags Flags associated with the entity. Integer 0
Owner The owner of the entity. String "admin"
Group The group associated with the entity. String "sales"
SymLinkTarget Symbolic link target of the entity. String "/var/salesdb/link"
FileType The file type of the entity. String "Text File"
CreatedAt The timestamp when the entity is created. Timestamp 2021-01-01T12:00:00Z
ModifiedAt The timestamp when the entity is modified. Timestamp 2023-01-01T12:00:00Z
AccessedAt The timestamp when the entity is accessed. Timestamp 2023-01-02T12:00:00Z
ScannedAt The timestamp when the entity is scanned. Timestamp 2023-01-03T12:00:00Z
IsSymlink Whether the entity is a symbolic link (True or False). Boolean False
LinkType The link type of the entity. String “<example>”
PhysicalLocation The physical location of the entity. String "ServerRoom1"
Title The title of the entity. String "2023 Sales Report"
Author The author of the entity. String "John Doe"
Subject The subject of the entity. String "Sales Analysis"
Application An application associated with the entity. String "Microsoft Excel"
Producer The producer of the entity. String "Microsoft"
Version A version of the entity. String "16.0"
DocumentSize The size of the document. Integer 102400
PageSize The size of the page. String "A4"
PageCount Number of pages in the entity. Integer 10
Company The company associated with the entity. String "Acme Corp"
Paragraphs The number of paragraphs in the entity. Integer 50
Lines The number of lines in the entity. Integer 200
Words The number of words in the entity. Integer 1000
Characters The number of characters in the entity. Integer 5000
CharactersWithSpaces The number of characters with spaces in the entity. Integer 6000
Language The language of the entity. String "English"
Checksum The checksum of the entity. String "e4d909c290d0fb1ca068ffaddf22cbd0"
PropertiesChecksum The checksum of the properties of the entity. String "abcd1234efgh5678ijkl9012mnop3456"
ChildDirs The number of child directories. Integer 5
ChildFiles The number of child files. Integer 20
ChildDirSize The size of child directories. Integer 4096
ChildFileSize The size of child files. Integer 8192
TotalChildDirs The total number of child directories. Integer 5
TotalChildFiles The total number of child files. Integer 20
TotalChildDirSize The total size of child directories. Integer 4096
TotalChildFileSize The total size of child files. Integer 8192

Entities Summary View

The Entities Summary View collection contains the details of an entity, such as type and the parent. The following table shows the details of the data available in this collection.

Field Description Data Type Example Value
_id A unique identifier for the summary entry. String “11/XE/SYNTHEA/ALLERGIES”
Name The name of the entity. String "ALLERGIES"
Type The type of the entity (for example, file, table). String "TABLE"
Parent The parent entity identifier. String "11/XE/SYNTHEA"

Entities Temperature Count View

The Entities Temperature View contains entity details emphasizing the categorization of data based on its temperature, which often indicates the frequency of access or modification, including the number of files. The following table shows the details of the data available in this collection.

Field Description Data Type Example Value
_id A unique identifier for the temperature count entry. String "65d2f44dd30b49309488b9dd"
DataSourceFqdnId A fully qualified domain name identifier for the data source. String "5"
Date The date when the file count and temperature were recorded. String 2024-02-19TO6:25:17.918+00:00
FileCount The number of files associated with the specified temperature. String 2
Temperature The temperature category of the data (for example, unclassified, hot, warm, and cold). String "unclassified"

Entity Usage Statistic View

The Entity Usage Statistic View collection includes a range of usage metrics, such as the number of times an entity is read, written to, and altered, along with the timestamp. The following table shows the details of the data available in this collection.

Field Description Data Type Example Value
_id A unique identifier for the statistics view entry. String “11/XE/SYNTHEA/ALLERGIES”
PeriodStartTime The start time of the entity's profiling period, in ISO format. Timestamp 2023-12-14T15:00:00Z
PeriodEndTime The end time of the entity's profiling period, in ISO format. Timestamp 2023-12-14T15:05:00Z
EntityID A unique identifier for the entity. String "12"
DatabaseName The name of the database where the entity is located. String “Postgres”
SchemaName The name of the schema within the database. String “Chinook”
TableName The name of the table containing the entity. String “Album”
LastReadTime The timestamp of the last read operation on the entity, in ISO format Timestamp 2023-12-14T11:00:00Z
LastWriteTime The timestamp of the last write operation on the entity, in ISO format. Timestamp 2023-12-14T11:50:00Z
LastAlterTime The timestamp of the last modification to the entity in ISO format. Timestamp 2023-12-14T11:20:00Z
ReadCount The total number of times the entity has been read. Integer 120
WriteCount The total number of times the entity has been written. Integer 45
AlterCount The total number of times the entity has been altered. Integer 3
CollectionTime The timestamp indicating when this data was collected, in ISO format. Timestamp 2023-12-16T15:00:00Z

Terms View

The Terms View collection contains information of terms related to items. It gives a structured overview, linking terms to specific entities and domains. Each record uniquely identifies a term, its association with an entity, the domain it belongs to, and a unique term identifier, enabling a comprehensive semantic mapping of data assets. The following table shows the details of the data available in this collection.

Field Description Data Type Example Value
_id A unique identifier for the term entry. String "abc12345-d678-90ef-ghij-klmn01234567"
EntityId A unique identifier for the associated entity. String "entity78901-2345-6789-abcd-ef0123456789"
TermName The name of the term. String "Customer Satisfaction Index"
DomainId An identifier for the domain the term belongs to. String "domain1234-5678-90ab-cdef-ghijklmnop"
TermId A unique identifier for the term. String "term5678-9012-3456-7890-abcd12345678"