Social Explorer includes a wide range of demographic, economic, social, environmental, and thematic datasets. When a user submits a question, the Data Navigator Assistant automatically selects the dataset that can produce the most accurate, complete, and reliable answer. This decision is based on geographic requirements, variable availability, time coverage, and methodological consistency.
How the Data Navigator Selects a Dataset
The Data Navigator first evaluates the geographic level requested, because some datasets support only certain geographies. For example, ACS 1-year estimates are published only for areas with populations of 65,000 or more, while ACS 5-year estimates also cover smaller geographies, down to census tracts and block groups. For small or detailed areas, the Data Navigator selects a dataset that covers the requested geography in full.
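The geographic check can be illustrated with a short sketch. This is not Social Explorer's actual implementation: the function name datasets_for_geography and the level labels are hypothetical, and the only outside fact used is the Census Bureau's 65,000-person publication threshold for ACS 1-year estimates.

```python
from typing import List, Optional

# Census Bureau publication threshold for ACS 1-year estimates.
ACS_1YR_MIN_POPULATION = 65_000

# Levels assumed in this sketch to be published only in 5-year data.
SMALL_AREA_LEVELS = {"tract", "block_group"}


def datasets_for_geography(level: str, population: Optional[int] = None) -> List[str]:
    """Return candidate ACS products for a geography, most current first.

    Hypothetical helper: the labels "acs_1yr" and "acs_5yr" are placeholders.
    """
    if level in SMALL_AREA_LEVELS:
        return ["acs_5yr"]
    if population is not None and population >= ACS_1YR_MIN_POPULATION:
        return ["acs_1yr", "acs_5yr"]
    # Unknown or small population: only the 5-year product is a safe choice.
    return ["acs_5yr"]
```

Returning an ordered list rather than a single name leaves room for later checks, such as variable availability, to move on to the next candidate.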
Next, the Data Navigator checks which datasets contain the required variables. Many variables exist only in specific tables or only for certain years. If a variable is not available in a 1-year dataset, the Data Navigator automatically switches to a 5-year dataset or another appropriate source.
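A minimal sketch of the variable check follows, assuming a simple in-memory catalog. The catalog contents, the variable names, and the function first_dataset_with are invented for illustration and do not reflect the real Social Explorer Data Library.

```python
from typing import Dict, Iterable, List, Optional, Set

# Hypothetical catalog: dataset label -> variables it carries.
# The entries are made up to show a variable present only in 5-year data.
CATALOG: Dict[str, Set[str]] = {
    "acs_1yr": {"median_household_income", "total_population"},
    "acs_5yr": {"median_household_income", "total_population", "detailed_ancestry"},
}


def first_dataset_with(variables: Iterable[str], preference: List[str]) -> Optional[str]:
    """Return the first dataset in `preference` that has every requested variable."""
    needed = set(variables)
    for dataset in preference:
        if needed <= CATALOG.get(dataset, set()):
            return dataset
    return None  # no listed dataset carries all the variables
```

With this catalog, a request for detailed_ancestry skips the 1-year entry and lands on the 5-year one, which mirrors the automatic switch described above.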
If the request names a specific year or asks for a comparison across years, the Data Navigator selects datasets with compatible timeframes and survey methodologies so that the comparison is valid. For non-ACS topics such as crime, health, or environmental data, the Data Navigator selects relevant datasets from the broader Social Explorer Data Library.
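One plausible compatibility rule, assumed here rather than taken from Social Explorer, is to compare only releases of the same product whose survey periods do not overlap. The Release class and the comparable function are hypothetical names for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    product: str   # placeholder label, e.g. "acs_1yr" or "acs_5yr"
    end_year: int  # last year of the survey period
    span: int      # number of years pooled (1 or 5 for ACS)

    @property
    def start_year(self) -> int:
        return self.end_year - self.span + 1


def comparable(a: Release, b: Release) -> bool:
    """Assumed rule: same product and non-overlapping survey periods."""
    if a.product != b.product:
        return False
    first, second = sorted((a, b), key=lambda r: r.end_year)
    return first.end_year < second.start_year
```

Under this rule, the 2013-2017 and 2018-2022 5-year releases can be compared, while 2016-2020 and 2018-2022 cannot, because they share the years 2018 through 2020.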
Examples of Dataset Selection
If a user asks for the population of Miami in 2020, the Data Navigator selects ACS 2020 5-year estimates for full city coverage. If a user asks to compare median income between Los Angeles and New York in 2022, the Data Navigator uses the ACS 2022 1-year estimates, provided both cities qualify. For median income by census tract in Cook County, the Data Navigator selects ACS 5-year estimates because 1-year datasets do not include tract-level data.
Fallback Logic and Transparency
When the initially selected dataset cannot fulfill the request, the Data Navigator applies fallback rules. These may include switching from 1-year to 5-year data, adjusting the geographic level, selecting an alternative variable, or, if none of these works, explaining why the request cannot be completed.
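The fallback rules can be pictured as an ordered list of adjustments tried one at a time. The sketch below is an assumption about how such a chain might look, not the Data Navigator's actual logic; resolve, use_5yr, coarsen_geography, and the request keys are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Request = Dict[str, str]
Adjust = Callable[[Request], Optional[Request]]


def use_5yr(req: Request) -> Optional[Request]:
    """Switch a 1-year request to 5-year data; not applicable otherwise."""
    return {**req, "dataset": "acs_5yr"} if req["dataset"] == "acs_1yr" else None


def coarsen_geography(req: Request) -> Optional[Request]:
    """Move up one geographic level, if a parent level is defined here."""
    parents = {"block_group": "tract", "tract": "county"}
    parent = parents.get(req["level"])
    return {**req, "level": parent} if parent else None


FALLBACKS: List[Tuple[str, Adjust]] = [
    ("switch_to_5yr", use_5yr),
    ("coarser_geography", coarsen_geography),
]


def resolve(request: Request,
            can_fulfill: Callable[[Request], bool],
            fallbacks: List[Tuple[str, Adjust]]) -> dict:
    """Try the request as given, then each fallback applied to the original."""
    if can_fulfill(request):
        return {"status": "ok", "request": request, "applied": None}
    for name, adjust in fallbacks:
        candidate = adjust(request)
        if candidate is not None and can_fulfill(candidate):
            return {"status": "ok", "request": candidate, "applied": name}
    # Nothing worked: report the failure instead of guessing.
    return {"status": "unavailable", "request": request, "applied": None,
            "reason": "no dataset satisfies the request"}
```

Each fallback is applied to the original request rather than stacked on the previous one, which keeps the result easy to explain. The "applied" field records which rule was used so that it can be reported back to the user.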
Every Data Navigator response includes a Sources panel showing the dataset, table, variables, geography, and relevant methodology notes. This lets users see and verify how the dataset was selected.
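The fields listed for the Sources panel map naturally onto a small record type. The following sketch assumes a structure only; SourceRecord and its summary method are hypothetical, and the sample values are illustrative (B19013 is the Census Bureau's table ID for median household income, and Social Explorer's own table IDs may differ).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SourceRecord:
    """Hypothetical container for the fields shown in a Sources panel."""
    dataset: str
    table: str
    variables: List[str]
    geography: str
    notes: List[str] = field(default_factory=list)

    def summary(self) -> str:
        """Render the record as plain text, one field per line."""
        lines = [
            f"Dataset: {self.dataset}",
            f"Table: {self.table}",
            f"Variables: {', '.join(self.variables)}",
            f"Geography: {self.geography}",
        ]
        lines += [f"Note: {note}" for note in self.notes]
        return "\n".join(lines)


# Illustrative values only, not real Data Navigator output.
example = SourceRecord(
    dataset="ACS 2022 5-Year Estimates",
    table="B19013",
    variables=["Median household income"],
    geography="Census tracts in Cook County, Illinois",
    notes=["1-year estimates do not include tract-level data."],
)
print(example.summary())
```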