category | column_name | data_type | example | comment |
---|---|---|---|---|
identification | file_seg_name | text | Segment32 | Segment identifier |
identification | file_section_id | number | 724 | Section ID |
identification | file_section_name | text | BIRDNAME ROAD | Name of Section |
identification | file_loc_from | number | 430 | Start metre |
identification | file_loc_to | number | 1445 | End metre |
identification | file_lane_name | text | All | Lane code |
quantity | file_length | number | 1015 | Length of segment in metres |
quantity | file_area_m2 | number | 10332 | Square metre area |
trigger | file_can_treat_flag | text | TRUE | Can this segment be considered for treatment (change as needed based on client policy) |
trigger | file_can_rehab_flag | text | FALSE | Can this segment be considered for Rehab (client-specific; change as needed based on client policy) |
trigger | file_ac_ok_flag | text | TRUE | Is the pavement suitable for asphalt resurfacing (based on current deflection/remaining pavement life/condition - subject to client policy/thresholds) |
surfacing | file_surf_class | text | cs | Surface Class (must be one of: 'cs', 'ac', 'blocks', 'concrete', 'other') |
trigger | file_next_surf | text | ac | What is the replacement surfacing type (one of: 'cs', 'ac', 'blocks', 'concrete', 'other') (change as needed based on client policy) |
trigger | file_earliest_treat_period | number | 1 | Specify the earliest possible modelling period in which the first treatment may be triggered (flag for fine control of treatment selection on certain elements) |
road | file_urban_rural | text | U | Urban/Rural Tag |
road | file_onrc | text | secondary collector | ONRC Category |
traffic | file_adt | number | 2463 | Average Daily Traffic |
traffic | file_heavy_perc | number | 5 | Heavy Vehicle Percentage |
traffic | file_no_of_bus_routes | number | 1 | Number of Bus Routes - Can be used in MCDA Model to lend greater weight to roads with more bus routes |
traffic | file_traff_growth_perc | number | 2 | Traffic Growth Percent |
surfacing | file_surf_date | text | 13/02/2003 | Surfacing Date dd/mm/yyyy (will be used to determine Surfacing Age using 'base_date' value in 'General' Lookup set) |
surfacing | file_surf_function | text | 2 | Surface Function |
surfacing | file_surf_material | text | RACK | Surfacing Material |
surfacing | file_surf_life_expected | number | 10 | Surfacing Expected life from RAMM. Important factor that determines surface remaining life and plays a role in S-Curve factors for distresses |
surfacing | file_surf_layer_no | number | 1 | Surfacing layer number |
surfacing | file_surf_thick | number | 9 | Surfacing thickness, in mm |
pavement | file_pave_date | text | 8/02/1976 | Pavement Construction Date dd/mm/yyyy (will be used to determine Pavement Age using 'base_date' value in 'General' Lookup set) |
pavement | file_pave_remlife | number | 20 | Age based pavement remaining life |
maint_fault | file_su_fault_qty | number | 0 | Surfacing Faults in Square Metres (open dispatches). Plays a role in calculation of Surfacing Distress Index |
maint_fault | file_pa_fault_qty | number | 0 | Pavement Faults in Square Metres (open dispatches). Plays a role in calculation of Pavement Distress Index |
hsd | file_rough_survey_date | text | 27/03/2023 | Roughness Survey Date dd/mm/yyyy (used to determine Survey Age using 'base_date' value in 'General' Lookup set). Used in turn to determine if survey is outdated. |
hsd | file_naasra_85 | number | 110 | Naasra 85th Percentile |
hsd | file_rut_survey_date | text | 27/03/2023 | Rut Survey Date dd/mm/yyyy (used to determine Survey Age using 'base_date' value in 'General' Lookup set). Used in turn to determine if survey is outdated. |
hsd | file_rut_lwpmean_85 | number | 4 | LWP Mean Rut 85th percentile |
hsd | file_rut_rwpmean_85 | number | 4 | RWP Mean Rut 85th percentile |
distress | file_cond_survey_date | text | 18/03/2023 | Condition Survey Date dd/mm/yyyy (used to determine Survey Age using 'base_date' value in 'General' Lookup set). Used in turn to determine if survey is outdated. |
distress | file_pct_allig | number | 0 | Alligator/Mesh Cracks percent of segment length |
distress | file_pct_lt_crax | number | 1.06 | L&T Cracks percent of segment length |
distress | file_pct_poth | number | 4.3E-3 | Potholes percent of segment length |
distress | file_pct_scabb | number | 0.15 | Scabbing percent of segment length |
distress | file_pct_flush | number | 0 | Flushing percent of segment length |
distress | file_pct_shove | number | 0 | Shoving percent of segment length |
distress | file_pct_edgebreak | number | 0 | Edge Breaks percent of segment length |
Input Data
Overview
This page outlines the required structure for the model input data. The model input set is a .CSV file containing specific column names with their data for all network elements. The preparation of the model input file is a vital task in the modelling pipeline. You should take all reasonable care to ensure that the input data is up-to-date and accurate.
Typically, the Model Input data can be prepared through a combination of Data Joins within your Asset Management System. JunoAMS, for example, has a sophisticated data join feature that allows you to create Join Parameters that tie certain columns in your database to an output set. For more information on JunoAMS’s data join feature see this link.
Input Set Preparation
Regardless of what tools you use to create the basis for your input data, you may need to make project-specific adjustments to certain flags and parameters in the input set based on the policies of the network you are working on.
For example, the input data contains several TRUE/FALSE flags, such as “file_can_rehab_flag”, which indicates whether an element can be considered for rehabilitation or not. You can set a default value of TRUE for all elements for such flags, but this should then be updated using a script or spreadsheet formula to apply client policies. For example, a client may have a policy not to consider rehabilitation on roads with an Average Daily Traffic (ADT) volume of less than 120 vehicles per day. In such a case, you should apply a script or Excel formula to update the “file_can_rehab_flag” column values based on the values in “file_adt”.
Another aspect you should address when preparing your input data set is missing data. The Juno Cassandra model will throw an error if any columns contain missing data, or if numeric columns contain non-numeric data.
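The ADT policy described above could be scripted along the following lines. This is a minimal sketch rather than part of the Cassandra tooling: the function name `apply_rehab_policy`, the constant `ADT_REHAB_THRESHOLD`, and the second sample segment are hypothetical, and the in-memory CSV stands in for a real input file with only the relevant columns shown.

```python
import csv
import io

# Example policy value taken from the text above (vehicles per day).
ADT_REHAB_THRESHOLD = 120


def apply_rehab_policy(rows, min_adt=ADT_REHAB_THRESHOLD):
    """Set file_can_rehab_flag to FALSE where file_adt is below min_adt."""
    updated = []
    for row in rows:
        row = dict(row)  # copy so the caller's rows are left untouched
        if float(row["file_adt"]) < min_adt:
            row["file_can_rehab_flag"] = "FALSE"
        updated.append(row)
    return updated


# Tiny in-memory stand-in for a real input file (illustrative values).
sample_csv = """file_seg_name,file_adt,file_can_rehab_flag
Segment32,2463,TRUE
Segment33,85,TRUE
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
result = apply_rehab_policy(rows)
for row in result:
    print(row["file_seg_name"], row["file_can_rehab_flag"])
```

Keeping the threshold as a named parameter makes it easy to change the rule per client without touching the rest of the preparation script.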
You should therefore include logic in your pre-processing steps to assign reasonable defaults to cells that have missing data. You can use imputation algorithms, such as those available in the R language, to make educated guesses about missing data, or you can keep it simple and assign averages (or modes in the case of text data) based on ONRC category etc.
While it is possible to encapsulate some client policies (as well as rules for handling missing values) in the Domain Model itself, we have deliberately chosen to put client-specific policy logic, and rules for assigning default values, in the input preparation stage rather than in the model itself.
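As an illustration of the simple approach, the sketch below fills blank cells with the average (numeric columns) or the mode (text columns) of the same ONRC category. The helper `fill_missing_by_group` and the sample rows are hypothetical, and falling back to the network-wide value when a category has no data at all is an assumption, not a rule prescribed by the model.

```python
from collections import Counter, defaultdict
from statistics import mean


def fill_missing_by_group(rows, column, group_col="file_onrc", numeric=True):
    """Fill blank cells in `column` with the group mean (numeric) or mode (text)."""
    observed = defaultdict(list)
    for row in rows:
        value = row[column].strip()
        if value:
            observed[row[group_col]].append(float(value) if numeric else value)

    # Network-wide values, used when a whole group has no observations.
    all_values = [v for values in observed.values() for v in values]

    def default_for(values):
        # Raises an error if the column has no data anywhere in the set.
        if numeric:
            return str(round(mean(values), 2))
        return Counter(values).most_common(1)[0][0]

    filled = []
    for row in rows:
        row = dict(row)  # copy so the original rows are not modified
        if not row[column].strip():
            values = observed.get(row[group_col]) or all_values
            row[column] = default_for(values)
        filled.append(row)
    return filled


# Illustrative rows only; a real input set has all required columns.
rows = [
    {"file_onrc": "secondary collector", "file_adt": "2000", "file_surf_material": "RACK"},
    {"file_onrc": "secondary collector", "file_adt": "3000", "file_surf_material": "RACK"},
    {"file_onrc": "secondary collector", "file_adt": "", "file_surf_material": ""},
    {"file_onrc": "access", "file_adt": "100", "file_surf_material": "RACK"},
]

filled = fill_missing_by_group(rows, "file_adt", numeric=True)
filled = fill_missing_by_group(filled, "file_surf_material", numeric=False)
print(filled[2])
```

Whichever method you use, it is worth logging which cells were filled so that defaults can be reviewed with the client.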
This not only simplifies the model logic significantly, but also gives the modeller the freedom to use their own tools and skills to prepare an input set that fully encapsulates client policies and preferences related to default values regardless of how complex these rules may be.
Input Data Required Columns
The table below lists all of the required columns in the model input set. Note that the model input logic considers column names to be case-sensitive. Thus you should ensure that your input data contains the exact column names as given below.
The Cassandra Framework Model will throw an error if any cells in your input set are empty or do not contain the correct data type. Specifically, numeric columns should not contain any empty values or text values that cannot be converted to numbers.
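A pre-flight check run before handing the file to the model can catch these problems early. The sketch below is an assumption-laden illustration: `validate_input` and `REQUIRED_COLUMNS` are hypothetical names, only a handful of the required columns are listed, and Python's `float()` is used as a stand-in for whatever numeric conversion the model applies internally.

```python
import csv
import io

# Subset of the required columns and their data types, taken from the table.
# A real check would list every required column.
REQUIRED_COLUMNS = {
    "file_seg_name": "text",
    "file_length": "number",
    "file_adt": "number",
    "file_surf_class": "text",
}


def validate_input(rows, fieldnames, required=REQUIRED_COLUMNS):
    """Return a list of problems: missing columns, empty cells, bad numbers."""
    errors = []
    # Column names are compared exactly, so the check is case-sensitive.
    missing = [name for name in required if name not in fieldnames]
    for name in missing:
        errors.append(f"missing column: {name}")
    for line_no, row in enumerate(rows, start=2):  # line 1 is the header
        for name, data_type in required.items():
            if name in missing:
                continue
            value = (row[name] or "").strip()
            if not value:
                errors.append(f"line {line_no}: {name} is empty")
            elif data_type == "number":
                try:
                    float(value)  # assumed stand-in for the model's conversion
                except ValueError:
                    errors.append(f"line {line_no}: {name} is not numeric: {value!r}")
    return errors


# Illustrative file with a wrongly cased header, an empty cell and a bad number.
sample_csv = """file_seg_name,file_length,file_adt,File_Surf_Class
Segment32,1015,2463,cs
Segment33,,n/a,cs
"""

reader = csv.DictReader(io.StringIO(sample_csv))
errors = validate_input(list(reader), reader.fieldnames)
for message in errors:
    print(message)
```

Collecting all problems in a list, rather than stopping at the first one, lets you fix the whole file in a single pass.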