Quick Start Guide¶
This guide will walk you through the basic usage of the gigaspatial package. By the end of this guide, you will be able to download, process, and store geospatial data using the package.
Prerequisites¶
Before you begin, ensure that you have installed the gigaspatial package. If you haven't installed it yet, follow the Installation Guide.
Importing the Package¶
Start by importing the gigaspatial package, for example with import gigaspatial.
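If the import fails, the package is usually not installed in the active environment. The following is a minimal sketch, using only the standard library, that checks whether a package is importable before importing it. The helper name is_installed is illustrative and not part of gigaspatial.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the interpreter can locate the top-level package."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("gigaspatial"):
    import gigaspatial  # the plain import used throughout this guide

    print("gigaspatial is available")
else:
    print("gigaspatial is not installed; see the Installation Guide")
```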
Setting Up Configuration¶
The gigaspatial package uses a unified configuration system to manage paths, API keys, and other settings.
- Environment Variables: Most configuration is handled via environment variables, which can be set in a .env file at the project root. For a full list of supported variables and their descriptions, see the Configuration Guide.
- Defaults: If a variable is not set, sensible defaults are used for all paths and keys.
- Manual Overrides: You can override data directory paths in your code using config.set_path.
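The three sources above suggest a lookup order of manual override, then environment variable, then default. The sketch below illustrates that order with the standard library only. It assumes this precedence and is not the package's actual implementation; resolve_path and the default values are hypothetical.

```python
import os

# Hypothetical defaults, for illustration only (not gigaspatial's real defaults).
DEFAULT_PATHS = {"BRONZE_DIR": "data/bronze", "GOLD_DIR": "data/gold"}


def resolve_path(name, overrides=None, environ=None):
    """Return the first value found: override, then environment, then default."""
    overrides = overrides or {}
    environ = os.environ if environ is None else environ
    if name in overrides:
        return overrides[name]
    return environ.get(name, DEFAULT_PATHS[name])


# A stand-in for the process environment, so the example is reproducible.
env = {"BRONZE_DIR": "/mnt/bronze"}

print(resolve_path("BRONZE_DIR", environ=env))  # value from the environment
print(resolve_path("GOLD_DIR", environ=env))  # falls back to the default
print(resolve_path("GOLD_DIR", overrides={"GOLD_DIR": "/tmp/gold"}, environ=env))
```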
Example .env File¶
BRONZE_DIR=/path/to/your/bronze_tier_data
SILVER_DIR=/path/to/your/silver_tier_data
GOLD_DIR=/path/to/your/gold_tier_data
VIEWS_DIR=/path/to/your/views_data
CACHE_DIR=/path/to/your/cache
ADMIN_BOUNDARIES_DIR=/path/to/your/admin_boundaries
MAPBOX_ACCESS_TOKEN=your_mapbox_token_here
# ... other keys ...
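The file format is one KEY=VALUE pair per line, with # starting a comment. How gigaspatial loads the file is not shown in this guide, so the parser below is only a simplified standard-library sketch of the format. The function name parse_env_text is hypothetical.

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


sample = """
BRONZE_DIR=/path/to/your/bronze_tier_data
# ... other keys ...
MAPBOX_ACCESS_TOKEN=your_mapbox_token_here
"""

settings = parse_env_text(sample)
print(sorted(settings))
```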
Setting Paths Programmatically¶
from gigaspatial.config import config
config.set_path("bronze", "/path/to/your/bronze_tier_data")
config.set_path("gold", "/path/to/your/gold_tier_data")
config.set_path("views", "/path/to/your/views_data")
For more details and troubleshooting, see the full configuration guide.
Downloading and Processing Geospatial Data¶
The gigaspatial package provides several handlers for different types of geospatial data. Here are examples for two commonly used handlers:
GHSL (Global Human Settlement Layer) Data¶
The GHSLDataHandler provides access to various GHSL products including built-up surface, building height, population, and settlement model data:
from gigaspatial.handlers import GHSLDataHandler
# Initialize the handler with desired product and parameters
ghsl_handler = GHSLDataHandler(
    product="GHS_BUILT_S",  # Built-up surface
    year=2020,
    resolution=100,  # 100m resolution
)
# Download data for a specific country
country_code = "TUR"
downloaded_files = ghsl_handler.load_data(country_code, ensure_available=True)
# Load the data into a DataFrame
df = ghsl_handler.load_into_dataframe(country_code, ensure_available=True)
print(df.head())
# You can also load data for specific points or geometries
points = [(38.404581, 27.4816677), (39.8915702, 32.7809618)]
df_points = ghsl_handler.load_into_dataframe(points, ensure_available=True)
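The guide does not state whether point tuples are (latitude, longitude) or (longitude, latitude). The sample values are consistent with (latitude, longitude) locations inside Türkiye, which the quick check below illustrates. The bounding box is approximate and the helper looks_like_lat_lon is hypothetical, so confirm the expected order in the API Documentation before relying on it.

```python
# Approximate bounding box for Türkiye in degrees (an assumption for illustration).
LAT_RANGE = (35.8, 42.2)
LON_RANGE = (25.6, 44.9)


def looks_like_lat_lon(point):
    """Return True if the tuple falls inside the box when read as (lat, lon)."""
    lat, lon = point
    return (
        LAT_RANGE[0] <= lat <= LAT_RANGE[1]
        and LON_RANGE[0] <= lon <= LON_RANGE[1]
    )


points = [(38.404581, 27.4816677), (39.8915702, 32.7809618)]

# Read the tuples as given, then with the two values swapped.
as_given = all(looks_like_lat_lon(p) for p in points)
swapped = all(looks_like_lat_lon((p[1], p[0])) for p in points)
print(as_given, swapped)  # prints: True False
```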
Google Open Buildings Data¶
The GoogleOpenBuildingsHandler provides access to Google's Open Buildings dataset, which includes building footprints and points:
from gigaspatial.handlers import GoogleOpenBuildingsHandler
# Initialize the handler
gob_handler = GoogleOpenBuildingsHandler()
# Download and load building polygons for a country
country_code = "TUR"
polygons_gdf = gob_handler.load_polygons(country_code, ensure_available=True)
# Download and load building points for a country
points_gdf = gob_handler.load_points(country_code, ensure_available=True)
# You can also load data for specific points or geometries
points = [(38.404581, 27.4816677), (39.8915702, 32.7809618)]
polygons_gdf = gob_handler.load_polygons(points, ensure_available=True)
Storing Geospatial Data¶
You can store the processed data in various formats using a DataStore class from the core.io module, such as LocalDataStore. Here's an example of saving data to a Parquet file:
from gigaspatial.core.io import LocalDataStore
# Initialize the data store
data_store = LocalDataStore()
# Save the processed data to a parquet file (opened in binary write mode)
with data_store.open("/path/to/your/output/processed_data.parquet", "wb") as f:
    processed_data.to_parquet(f)
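Writing through a file handle requires a write mode such as "wb"; a handle opened with "rb" is read-only. The sketch below demonstrates this with Python's built-in open, on the assumption that data_store.open follows the same mode conventions. The payload is a placeholder, not real Parquet output.

```python
import io
import os
import tempfile

payload = b"example bytes standing in for parquet output"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "processed_data.bin")

    # "wb" creates the file and accepts binary writes.
    with open(path, "wb") as f:
        f.write(payload)

    # "rb" is read-only: attempting to write raises io.UnsupportedOperation.
    with open(path, "rb") as f:
        try:
            f.write(payload)
            write_failed = False
        except io.UnsupportedOperation:
            write_failed = True
        round_trip = f.read()

print(write_failed, round_trip == payload)
```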
If your dataset is already a pandas.DataFrame or geopandas.GeoDataFrame, the write_dataset function from the core.io.writers module can write it in various formats.
from gigaspatial.core.io.writers import write_dataset
# Save the processed data to a GeoJSON file
write_dataset(data=processed_data, data_store=data_store, path="/path/to/your/output/processed_data.geojson")
Visualizing Geospatial Data¶
To visualize the geospatial data, you can use libraries like geopandas and matplotlib. Here's an example of plotting the processed data on a map:
import geopandas as gpd
import matplotlib.pyplot as plt
# Load the GeoJSON file
gdf = gpd.read_file("/path/to/your/output/processed_data.geojson")
# Plot the data
gdf.plot()
plt.show()
geopandas.GeoDataFrame.explore can also be used to visualize the data on an interactive map based on folium/leaflet.js, for example by calling gdf.explore() on the GeoDataFrame loaded above.
Next Steps¶
Now that you have a basic understanding of how to use the gigaspatial package, you can explore more advanced features and configurations. Check out the User Guide for detailed documentation and examples.
Additional Resources¶
- API Documentation: Detailed documentation of all classes and functions.
- Examples: Real-world examples and use cases.
- Changelog: Information about the latest updates and changes.