Ricardo A. Pasquini
January 21, 2020 by admin
Coding Notes, Uncategorized

Implementing a scalable geospatial operation in MongoDB

Summary

In this note I document an initial test implementation of a spatial join of 22 million points against nearly 16 thousand polygons using MongoDB, along with the steps needed to run the operation. The run took longer than I expected: more than 12 hours in total. My conclusion is that the approach can scale if it is combined with other techniques, such as simplifying the polygons.

Intro

In this post I share an implementation of a spatial-join type of analysis at scale using MongoDB. MongoDB is a NoSQL database system that is widely used in industry to store large databases distributed across multiple (cloud) machines. My case is the analysis of a large database of over 22 million geo-located tweets. My first objective is to implement a spatial-join kind of analysis that essentially counts tweets within census radii, which are spatial polygons; in this case I have 15,700 of them. Such an operation is a standard feature of geospatial packages such as ArcGIS or QGIS and, in Python, of GeoPandas. But my ultimate objective is finding a solution that is scalable with large amounts…
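To make the idea of counting points inside polygons concrete, here is a minimal sketch of how such a count could be expressed with pymongo. It is not necessarily the implementation used in the full post. It assumes that each tweet document stores a GeoJSON point in a field called `location` and that each polygon document stores a GeoJSON geometry in a field called `geometry`; those field names, the helper names, and the database and collection names in the usage note are all hypothetical. The only MongoDB features relied on are the `$geoWithin` operator with `$geometry` and the `count_documents` method of a collection.

```python
def polygon_filter(polygon_geometry, point_field="location"):
    """Build a MongoDB filter matching points that fall inside a GeoJSON polygon.

    `point_field` is an assumed field name holding a GeoJSON Point.
    """
    return {point_field: {"$geoWithin": {"$geometry": polygon_geometry}}}


def count_points_per_polygon(points, polygons, point_field="location"):
    """Count the points inside each polygon, issuing one query per polygon.

    `points` is expected to behave like a pymongo Collection (it must offer
    `count_documents`); `polygons` is any iterable of documents that carry
    an `_id` and a GeoJSON `geometry` (assumed field name).
    """
    counts = {}
    for polygon in polygons:
        query = polygon_filter(polygon["geometry"], point_field)
        counts[polygon["_id"]] = points.count_documents(query)
    return counts


# Hypothetical usage against a running MongoDB server:
#
#   from pymongo import MongoClient, GEOSPHERE
#   db = MongoClient()["tweets_db"]                     # assumed database name
#   db.tweets.create_index([("location", GEOSPHERE)])   # 2dsphere index
#   counts = count_points_per_polygon(db.tweets, db.polygons.find())
```

Issuing one query per polygon keeps the sketch simple; with thousands of polygons the total running time depends heavily on the 2dsphere index and on how complex each polygon is, which is consistent with the summary's remark about simplifying polygons.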


geopandas Geospatial analysis MongoDB pymongo python scalability spatial join

April 29, 2019 by admin
Coding Notes, Uncategorized

Mapping with geopandas and basemapping with contextily

I find the geopandas library really useful for mapping with layers. Contextily is also a nice library that adds a background basemap. Used together, they make it fairly simple to visualize shapes such as polygons and points alongside contextual mapping information, as in the following figure. Basemaps are drawn from OpenStreetMap under CC BY SA, and map tiles are from Stamen Design under CC BY 3.0. There are several options for tile design. View the code on Gist.
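As a rough illustration of combining the two libraries, the sketch below plots a GeoDataFrame over a web basemap. It is not the code from the Gist. The helper name `plot_with_basemap` and its defaults are hypothetical, and it uses the OpenStreetMap provider bundled with contextily rather than the Stamen tiles mentioned above. It assumes geopandas, contextily, and matplotlib are installed, that the GeoDataFrame has its CRS set, and that a network connection is available to fetch tiles.

```python
# Web Mercator, the projection used by most web tile services.
WEB_MERCATOR_EPSG = 3857


def plot_with_basemap(gdf, ax=None, alpha=0.5):
    """Plot a GeoDataFrame on top of an OpenStreetMap basemap.

    `gdf` must have a CRS defined so it can be reprojected.
    Imports are done lazily so the module loads even where the
    plotting libraries are not installed.
    """
    import contextily as ctx
    import matplotlib.pyplot as plt

    if ax is None:
        _, ax = plt.subplots(figsize=(8, 8))

    # Reproject to Web Mercator so the shapes line up with the tiles.
    projected = gdf.to_crs(epsg=WEB_MERCATOR_EPSG)
    projected.plot(ax=ax, alpha=alpha, edgecolor="black")

    # Download tiles for the current axis extent and draw them underneath.
    ctx.add_basemap(ax, source=ctx.providers.OpenStreetMap.Mapnik)
    ax.set_axis_off()
    return ax


# Hypothetical usage:
#
#   import geopandas as gpd
#   polygons = gpd.read_file("polygons.geojson")   # assumed file name
#   plot_with_basemap(polygons)
```

Reprojecting the data to match the tiles is the simplest route; `add_basemap` also accepts a `crs` argument for warping the tiles to the data's projection instead.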

contextily geopandas mapping python


Categories

  • Causal Inference (4)
  • Coding Notes (8)
  • Defi (1)
  • Economics (8)
  • Machine Learning (1)
  • Uncategorized (19)
  • Urban Economics (3)

Recent Posts

  • Note on the Impact of Liquidation on Health Factor in Overcollateralized Loans
  • Econometrics with simulations 📚
  • Explicando Inferencia por Aleatorización a un futbolero
  • Optimal calibration of a ML classifier based on business knowledge
  • Note on AMMs “picked-off” risk

Tag Cloud

AMM apps bienes publicos causal inference classification conda COVID-19 criptomonedas cryptocurrencies defi econometrics economía de mercados Ethereum Exportar resultados Export output Financial inclusion financiamiento cuadrático fraud geopandas Geospatial analysis Gitcoin h3 hexagons Households Finance Indebtedness Jupyter Loops machinelearning Mercado de Alquileres MongoDB negocios precision proyectos ingeniería public goods pymongo python quadratic funding recall Regression roc-curve scalability Stata Tablas Tables ubuntu

Recent Posts

  • Un atlas de deudas para Argentina
  • Bienes públicos, Gitcoin, y financiamiento cuadrático
  • An indebtedness atlas for Argentina
