P3: OpenStreetMap Data Case Study. Dubai and Abu-Dhabi.

0. Code Resources

0.1. Code Library

library(devtools)

library(RColorBrewer)

library(formattable)

library(ggmap)

Loading required package: ggplot2
Google Maps API Terms of Service: http://developers.google.com/maps/terms.
Please cite ggmap if you use it: see citation("ggmap") for details.

library(osmar)

library(RSQLite)

library(sqldf)

library(jsonlite)

library(mongolite)

library(plotly)

library(osmplotr)

library(geojsonio)

library(DT)

# Not installed

# library(RMongo)

# Not used

library(mapview)

library(bigmemory)

library(rio)

library(dygraphs)

library(highcharter)

library(rbokeh)

library(maps)

library(R2HTML)

0.2. Useful links

https://wiki.openstreetmap.org/wiki/OSM_XML

https://www.datacamp.com/community/tutorials/r-data-import-tutorial#gs.jUE2UHw

http://www2.uaem.mx/r-mirror/web/packages/osmar/osmar.pdf

https://www.researchgate.net/publication/274740645_Harnessing_open_street_map_data_with_R_and_QGIS

https://cran.r-project.org/web/packages/mongolite/vignettes/intro.html

https://journal.r-project.org/archive/2013-1/eugster-schlesinger.pdf

http://www.joyofdata.de/blog/mongodb-state-of-the-r-rmongodb/

https://edzer.github.io/sp/

https://cran.r-project.org/web/packages/ggmap/ggmap.pdf

https://media.readthedocs.org/pdf/jupyter-notebook/latest/jupyter-notebook.pdf

https://journal.r-project.org/archive/2013-1/kahle-wickham.pdf

https://www.r-bloggers.com/r-and-mongodb/

https://cran.r-project.org/web/packages/mongolite/mongolite.pdf

https://www.r-bloggers.com/r-and-sqlite-part-1/

https://www.datacamp.com/community/tutorials/importing-data-r-part-two#gs._PEI6iY

https://cran.r-project.org/web/packages/rio/vignettes/rio.html

http://flovv.github.io/Gas_price-Mapping/

1. Map Area

1.1. The map

I chose a map sector that covers a rapidly developing area of the UAE.

To display the area, I used the ggmap package together with the bounding coordinates recorded in dubai_abu-dhabi.osm.

bounds: minlat="23.7350" minlon="53.5800" maxlat="26.5390" maxlon="56.8870"

options(repr.plot.width = 9, repr.plot.height = 9)
osmmap <- get_map(location = c(53.5800,23.7350,56.8870,26.5390), source = "osm")
ggmap(osmmap, extent = "normal")
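To get a feel for how large this bounding box is, its width and height can be estimated with the haversine formula. The sketch below uses only base R. The helper name `haversine_km` and the mean Earth radius of 6371 km are assumptions made for illustration and are not part of the original notebook.

```r
# Great-circle distance in km between two lon/lat points (haversine formula).
# Hypothetical helper; assumes a spherical Earth with a radius of 6371 km.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(sqrt(pmin(1, a)))
}

# Bounds taken from dubai_abu-dhabi.osm
minlat <- 23.7350; minlon <- 53.5800
maxlat <- 26.5390; maxlon <- 56.8870
midlat <- (minlat + maxlat) / 2

# South-to-north extent along the western edge
height_km <- haversine_km(minlon, minlat, minlon, maxlat)
# West-to-east extent measured at the middle latitude
width_km <- haversine_km(minlon, midlat, maxlon, midlat)

round(c(width_km = width_km, height_km = height_km))
```

Both sides come out at a little over 300 km, which explains why the full extract is far larger than the small boxes requested through the API later on.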

Besides displaying maps, the ggmap package offers several other functions. Some examples of their use follow.

gc01 <- geocode("Jumerah", output = "more")
formattable(data.frame(gc01))

Source : https://maps.googleapis.com/maps/api/geocode/json?address=Jumerah

gc02 <- as.numeric(geocode("Jumerah"))
gc02

Source : https://maps.googleapis.com/maps/api/geocode/json?address=Jumerah

formattable(data.frame(revgeocode(gc02, output = "more")))

Information from URL : https://maps.googleapis.com/maps/api/geocode/json?latlng=25.2016428,55.2452567

gc04 <- as.numeric(geocode("Dubai International Airport"))
gc04

Source : https://maps.googleapis.com/maps/api/geocode/json?address=Dubai%20International%20Airport

formattable(data.frame(revgeocode(gc04, output = "more")))

Information from URL : https://maps.googleapis.com/maps/api/geocode/json?latlng=25.2531745,55.3656728

formattable(data.frame(mapdist("dubai", "abu-dhabi")))

Source : https://maps.googleapis.com/maps/api/distancematrix/json?origins=dubai&destinations=abu-dhabi&mode=driving&language=en-EN

formattable(data.frame(mapdist("Jumerah", "Dubai International Airport")))

Source : https://maps.googleapis.com/maps/api/distancematrix/json?origins=Jumerah&destinations=Dubai%20International%20Airport&mode=driving&language=en-EN

geocode("Burj Khalifa", output = "more")

Source : https://maps.googleapis.com/maps/api/geocode/json?address=Burj%20Khalifa

geocode("Business Bay", output = "more")

Source : https://maps.googleapis.com/maps/api/geocode/json?address=Business%20Bay

var_ways <- route('Burj Khalifa', 'Business Bay', alternatives = TRUE)
formattable(head(data.frame(var_ways)))

Source : https://maps.googleapis.com/maps/api/directions/json?origin=Burj%20Khalifa&destination=Business%20Bay&mode=driving&units=metric&alternatives=true

options(repr.plot.width = 5, repr.plot.height = 5)
ggplot(data = var_ways) + geom_leg(aes(x = startLon, xend = endLon, y = startLat, yend = endLat, color = route)) + coord_map()

options(repr.plot.width = 10, repr.plot.height = 4)

qmap(location=c(55.2820, 25.1900), zoom = 15, maptype = 'roadmap', base_layer = ggplot(aes(x = startLon, y = startLat), data = var_ways)) +
geom_leg(aes(x = startLon, xend = endLon, y = startLat, yend = endLat, color = route), alpha = 0.5, size = 2, data = var_ways) +
labs(x = 'Longitude', y = 'Latitude', colour = 'Route') +
facet_wrap(~ route, ncol = 3) + theme(legend.position = 'top')

Source : https://maps.googleapis.com/maps/api/staticmap?center=25.19,55.282&zoom=15&size=640x640&scale=2&maptype=roadmap&language=en-EN

options(repr.plot.width = 10, repr.plot.height = 10)

way_map <- get_map(location = c(55.2820, 25.1900), source = "google", zoom = 15, maptype = "hybrid")
ggmap(way_map) + geom_leg(data = var_ways, aes(x = startLon, xend = endLon, y = startLat, yend = endLat, color = route), alpha = 0.7, size = 2)

Source : https://maps.googleapis.com/maps/api/staticmap?center=25.19,55.282&zoom=15&size=640x640&scale=2&maptype=hybrid&language=en-EN

1.2 Extract with osmar R

There are several ways to extract geodata. One of them is to request it directly from the OpenStreetMap API with the osmar package.

The following commands download the data for a bounding box defined by its centre coordinates and its width and height.

src <- osmsource_api()

smallbox <- center_bbox(55.2708, 25.2048, 1000, 1000)
sdubai <- get_osm(smallbox, source = src)
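`center_bbox()` takes a centre longitude and latitude plus a width and height in metres. To illustrate roughly which area the call above requests, here is an approximation in base R. It is not osmar's implementation: the helper `approx_center_bbox` is hypothetical and assumes about 111,320 m per degree of latitude.

```r
# Approximate lon/lat box around a centre point (hypothetical helper).
# Assumes ~111,320 m per degree of latitude and scales longitude by cos(lat).
approx_center_bbox <- function(center_lon, center_lat, width_m, height_m) {
  m_per_deg_lat <- 111320
  m_per_deg_lon <- m_per_deg_lat * cos(center_lat * pi / 180)
  half_dlon <- (width_m / 2) / m_per_deg_lon
  half_dlat <- (height_m / 2) / m_per_deg_lat
  c(left = center_lon - half_dlon, bottom = center_lat - half_dlat,
    right = center_lon + half_dlon, top = center_lat + half_dlat)
}

# Same centre and size as the small box requested above
box <- approx_center_bbox(55.2708, 25.2048, 1000, 1000)
round(box, 4)
```

At this latitude a degree of longitude is shorter than a degree of latitude, so the box spans slightly more degrees east to west than north to south.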

str(sdubai)

List of 3
 $ nodes    :List of 2
  ..$ attrs:'data.frame':	1625 obs. of  9 variables:
  .. ..$ id       : num [1:1625] 9.40e+07 1.12e+09 1.12e+09 1.12e+09 1.12e+09 ...
  .. ..$ visible  : Factor w/ 1 level "true": 1 1 1 1 1 1 1 1 1 1 ...
  .. ..$ timestamp: POSIXlt[1:1625], format: "2010-12-02 13:32:04" "2012-08-14 22:09:32" ...
  .. ..$ version  : num [1:1625] 5 2 1 2 2 2 2 2 2 3 ...
  .. ..$ changeset: num [1:1625] 6514690 12732672 7103923 12732672 12732672 ...
  .. ..$ user     : Factor w/ 48 levels "Alex111X","andi9876",..: 39 42 42 42 42 42 42 42 42 42 ...
  .. ..$ uid      : Factor w/ 48 levels "10927","114220",..: 1 15 15 15 15 15 15 15 15 15 ...
  .. ..$ lat      : num [1:1625] 25.2 25.2 25.2 25.2 25.2 ...
  .. ..$ lon      : num [1:1625] 55.3 55.3 55.3 55.3 55.3 ...
  ..$ tags :'data.frame':	232 obs. of  3 variables:
  .. ..$ id: num [1:232] 6.04e+08 6.04e+08 6.04e+08 6.04e+08 6.04e+08 ...
  .. ..$ k : Factor w/ 44 levels "addr:city","addr:housenumber",..: 29 19 34 44 18 28 15 30 15 30 ...
  .. ..$ v : Factor w/ 117 levels "+971 4 355 1116",..: 81 110 84 96 24 82 65 6 65 7 ...
  ..- attr(*, "class")= chr [1:3] "nodes" "osmar_element" "list"
 $ ways     :List of 3
  ..$ attrs:'data.frame':	226 obs. of  7 variables:
  .. ..$ id       : num [1:226] 1.07e+07 9.71e+07 1.56e+08 4.75e+07 4.75e+07 ...
  .. ..$ visible  : Factor w/ 1 level "true": 1 1 1 1 1 1 1 1 1 1 ...
  .. ..$ timestamp: POSIXlt[1:226], format: "2013-06-09 15:08:01" "2016-05-04 04:25:11" ...
  .. ..$ version  : num [1:226] 6 5 3 3 2 5 6 3 3 1 ...
  .. ..$ changeset: num [1:226] 16483397 39086578 12732672 35518461 35518461 ...
  .. ..$ user     : Factor w/ 31 levels "4b696d","Alex111X",..: 7 23 26 15 15 15 20 20 26 24 ...
  .. ..$ uid      : Factor w/ 31 levels "10927","111410",..: 11 7 9 27 27 27 14 14 9 1 ...
  ..$ tags :'data.frame':	651 obs. of  3 variables:
  .. ..$ id: num [1:651] 10710679 10710679 97120535 97120535 97120535 ...
  .. ..$ k : Factor w/ 51 levels "access","addr:city",..: 21 35 43 35 30 28 21 35 21 30 ...
  .. ..$ v : Factor w/ 101 levels "-1","+971 4 323 0000",..: 65 98 25 98 7 11 82 98 95 29 ...
  ..$ refs :'data.frame':	1917 obs. of  2 variables:
  .. ..$ id : num [1:1917] 10710679 10710679 10710679 97120535 97120535 ...
  .. ..$ ref: num [1:1917] 9.51e+07 9.51e+07 4.31e+08 1.13e+09 1.02e+09 ...
  ..- attr(*, "class")= chr [1:3] "ways" "osmar_element" "list"
 $ relations:List of 3
  ..$ attrs:'data.frame':	9 obs. of  7 variables:
  .. ..$ id       : num [1:9] 2183664 2183666 420297 6636849 6636851 ...
  .. ..$ visible  : Factor w/ 1 level "true": 1 1 1 1 1 1 1 1 1
  .. ..$ timestamp: POSIXlt[1:9], format: "2012-08-14 22:09:16" "2012-05-14 20:27:01" ...
  .. ..$ version  : num [1:9] 2 1 27 1 1 1 1 1 1
  .. ..$ changeset: num [1:9] 12732672 11599712 43692177 42753529 42753529 ...
  .. ..$ user     : Factor w/ 6 levels "4b696d","Alex111X",..: 5 5 6 1 1 4 3 3 2
  .. ..$ uid      : Factor w/ 6 levels "114220","1420318",..: 3 3 6 2 2 4 1 1 5
  ..$ tags :'data.frame':	25 obs. of  3 variables:
  .. ..$ id: num [1:25] 2183664 2183664 2183666 2183666 420297 ...
  .. ..$ k : Factor w/ 13 levels "building","by_night",..: 9 13 9 13 2 3 4 5 6 7 ...
  .. ..$ v : Factor w/ 16 levels "#CC0000","2",..: 11 13 11 13 9 1 3 12 5 6 ...
  ..$ refs :'data.frame':	53 obs. of  4 variables:
  .. ..$ id  : num [1:53] 2183664 2183664 2183664 2183666 2183666 ...
  .. ..$ type: Factor w/ 2 levels "node","way": 2 1 2 2 2 1 1 1 1 1 ...
  .. ..$ ref : num [1:53] 9.69e+07 9.27e+08 1.76e+08 1.56e+08 1.64e+08 ...
  .. ..$ role: Factor w/ 7 levels "","from","inner",..: 2 7 6 2 6 7 5 5 5 5 ...
  ..- attr(*, "class")= chr [1:3] "relations" "osmar_element" "list"
 - attr(*, "class")= chr [1:2] "osmar" "list"

bigbox <- center_bbox(55.2708, 25.2048, 6000, 6000)
bdubai <- get_osm(bigbox, source = src)

str(bdubai)

List of 3
 $ nodes    :List of 2
  ..$ attrs:'data.frame':	47837 obs. of  9 variables:
  .. ..$ id       : num [1:47837] 30593914 30593915 31473923 31474006 31474005 ...
  .. ..$ visible  : Factor w/ 1 level "true": 1 1 1 1 1 1 1 1 1 1 ...
  .. ..$ timestamp: POSIXlt[1:47837], format: "2016-08-19 09:40:14" "2010-12-14 12:40:14" ...
  .. ..$ version  : num [1:47837] 19 4 2 5 5 5 2 5 2 2 ...
  .. ..$ changeset: num [1:47837] 41552017 6657884 6514101 7313392 7313392 ...
  .. ..$ user     : Factor w/ 202 levels "08xavstj","12Katniss",..: 60 173 173 182 182 182 173 173 173 78 ...
  .. ..$ uid      : Factor w/ 202 levels "1069176","10927",..: 58 2 2 40 40 40 2 2 2 76 ...
  .. ..$ lat      : num [1:47837] 25.2 25.2 25.2 25.2 25.2 ...
  .. ..$ lon      : num [1:47837] 55.3 55.3 55.3 55.3 55.3 ...
  ..$ tags :'data.frame':	1957 obs. of  3 variables:
  .. ..$ id: num [1:1957] 9.11e+07 9.50e+07 9.50e+07 2.60e+08 2.81e+08 ...
  .. ..$ k : Factor w/ 103 levels "access","addr:city",..: 35 35 71 12 85 49 40 49 50 52 ...
  .. ..$ v : Factor w/ 753 levels "-1","+18006437560",..: 672 498 64 353 465 481 298 152 734 152 ...
  ..- attr(*, "class")= chr [1:3] "nodes" "osmar_element" "list"
 $ ways     :List of 3
  ..$ attrs:'data.frame':	6602 obs. of  7 variables:
  .. ..$ id       : num [1:6602] 4.86e+06 1.04e+08 1.04e+08 1.04e+08 1.06e+07 ...
  .. ..$ visible  : Factor w/ 1 level "true": 1 1 1 1 1 1 1 1 1 1 ...
  .. ..$ timestamp: POSIXlt[1:6602], format: "2014-05-05 11:47:38" "2011-03-12 17:37:11" ...
  .. ..$ version  : num [1:6602] 9 1 1 2 10 7 5 8 6 4 ...
  .. ..$ changeset: num [1:6602] 22145147 7535955 7535955 35985485 16483397 ...
  .. ..$ user     : Factor w/ 125 levels "12Katniss","13 digits",..: 30 112 112 6 29 112 29 79 29 105 ...
  .. ..$ uid      : Factor w/ 125 levels "1069176","10927",..: 123 31 31 65 39 31 39 40 39 2 ...
  ..$ tags :'data.frame':	10175 obs. of  3 variables:
  .. ..$ id: num [1:10175] 4.86e+06 4.86e+06 4.86e+06 1.04e+08 1.04e+08 ...
  .. ..$ k : Factor w/ 135 levels "access","access:note",..: 58 74 96 58 58 25 79 58 68 74 ...
  .. ..$ v : Factor w/ 866 levels "-1","-2","+971 4 323 0000",..: 635 257 826 715 715 826 810 634 1 26 ...
  ..$ refs :'data.frame':	56899 obs. of  2 variables:
  .. ..$ id : num [1:56899] 4.86e+06 4.86e+06 4.86e+06 4.86e+06 1.04e+08 ...
  .. ..$ ref: num [1:56899] 9.10e+07 2.84e+09 2.84e+09 9.10e+07 9.39e+07 ...
  ..- attr(*, "class")= chr [1:3] "ways" "osmar_element" "list"
 $ relations:List of 3
  ..$ attrs:'data.frame':	53 obs. of  7 variables:
  .. ..$ id       : num [1:53] 2757400 2757402 2757403 1320963 1320964 ...
  .. ..$ visible  : Factor w/ 1 level "true": 1 1 1 1 1 1 1 1 1 1 ...
  .. ..$ timestamp: POSIXlt[1:53], format: "2013-02-13 16:02:48" "2013-02-13 16:02:48" ...
  .. ..$ version  : num [1:53] 1 1 1 1 1 2 1 1 1 1 ...
  .. ..$ changeset: num [1:53] 15019545 15019545 15019545 6657884 6657884 ...
  .. ..$ user     : Factor w/ 16 levels "4b696d","Alex111X",..: 15 15 15 13 13 15 15 10 10 10 ...
  .. ..$ uid      : Factor w/ 16 levels "10927","114220",..: 5 5 5 1 1 5 5 6 6 6 ...
  ..$ tags :'data.frame':	288 obs. of  3 variables:
  .. ..$ id: num [1:288] 2757400 2757400 2757402 2757402 2757403 ...
  .. ..$ k : Factor w/ 184 levels "alt_name:af",..: 175 180 175 180 175 180 175 180 175 180 ...
  .. ..$ v : Factor w/ 167 levels "-1","#CC0000",..: 105 124 105 124 105 124 105 124 105 124 ...
  ..$ refs :'data.frame':	1526 obs. of  4 variables:
  .. ..$ id  : num [1:1526] 2757400 2757400 2757400 2757402 2757402 ...
  .. ..$ type: Factor w/ 2 levels "node","way": 2 2 1 2 2 1 2 2 1 2 ...
  .. ..$ ref : num [1:1526] 2.05e+08 2.05e+08 2.15e+09 2.05e+08 2.05e+08 ...
  .. ..$ role: Factor w/ 11 levels "","cable","from",..: 3 10 11 3 10 11 3 10 11 3 ...
  ..- attr(*, "class")= chr [1:3] "relations" "osmar_element" "list"
 - attr(*, "class")= chr [1:2] "osmar" "list"
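The `str()` listings above are long. A more compact view of the same structure can be sketched as a vector of row counts per component. The helper `osm_counts` is hypothetical, and the toy list below only mimics the nodes/ways/relations layout shown in the output. With the real data one would pass `sdubai` or `bdubai` instead.

```r
# Count the rows of the attrs table of each top-level component
# (hypothetical helper, written against the list layout shown by str()).
osm_counts <- function(x) {
  vapply(x, function(element) nrow(element$attrs), integer(1))
}

# Toy stand-in with the same nesting as an osmar object (made-up data)
toy <- list(
  nodes = list(attrs = data.frame(id = 1:5), tags = data.frame(id = 1:2)),
  ways = list(attrs = data.frame(id = 1:3)),
  relations = list(attrs = data.frame(id = 1L))
)

counts <- osm_counts(toy)
counts
```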

node_tags <- sort(unique(bdubai$nodes$tags$k))
print(node_tags)

  [1] access                          addr:city                      
  [3] addr:country                    addr:flats                     
  [5] addr:housename                  addr:housenumber               
  [7] addr:place                      addr:postcode                  
  [9] addr:street                     addr:subdistrict               
 [11] aeroway                         amenity                        
 [13] barrier                         bench                          
 [15] bicycle                         building                       
 [17] capacity                        construction                   
 [19] contact:instagram               country                        
 [21] covered                         crossing                       
 [23] cuisine                         delivery                       
 [25] description                     diplomatic                     
 [27] direction                       drive_in                       
 [29] drive_through                   ele                            
 [31] email                           entrance                       
 [33] fee                             foot                           
 [35] highway                         horse                          
 [37] indoor_seating                  internet_access                
 [39] internet_access:fee             is_in                          
 [41] layer                           leisure                        
 [43] level                           levels                         
 [45] lit                             man_made                       
 [47] maxspeed                        motor_vehicle                  
 [49] name                            name:ar                        
 [51] name:de                         name:en                        
 [53] name:fr                         name:ko                        
 [55] name:pl                         name:ru                        
 [57] natural                         note                           
 [59] office                          opening_hours                  
 [61] operator                        outdoor_seating                
 [63] parking                         payment:bitcoin                
 [65] phone                           place                          
 [67] platforms                       power                          
 [69] public_transport                railway                        
 [71] ref                             religion                       
 [73] seamark:beacon_lateral:category seamark:beacon_lateral:colour  
 [75] seamark:beacon_lateral:system   seamark:information            
 [77] seamark:light:character         seamark:light:colour           
 [79] seamark:light:group             seamark:light:period           
 [81] seamark:light:reference         seamark:name                   
 [83] seamark:type                    shelter                        
 [85] shop                            shower                         
 [87] smoking                         source                         
 [89] sport                           station                        
 [91] subway                          supervised                     
 [93] surface                         surveillance                   
 [95] surveillance:type               surveillance:zone              
 [97] takeaway                        tourism                        
 [99] traffic_calming                 type                           
[101] website                         wheelchair                     
[103] wikipedia                      
103 Levels: access addr:city addr:country addr:flats ... wikipedia

way_tags <- sort(unique(bdubai$ways$tags$k))
print(way_tags)

  [1] access                   access:note              addr:city               
  [4] addr:country             addr:housename           addr:housenumber        
  [7] addr:postcode            addr:street              addr:suburb             
 [10] admin_level              aerialway                aeroway                 
 [13] alt_name                 alt_name:hu              alt_name2               
 [16] alt_old_name:hu          amenity                  area                    
 [19] atm                      barrier                  bicycle                 
 [22] boundary                 bridge                   bridge:structure        
 [25] building                 building:height          building:levels         
 [28] building:material        building:part            bus                     
 [31] cables                   capacity                 construction            
 [34] contact:email            contact:facebook         contact:fax             
 [37] contact:google_plus      contact:instagram        contact:phone           
 [40] contact:twitter          contact:website          covered                 
 [43] created_by               crossing                 cutting                 
 [46] description              destination              destination:lanes       
 [49] ele                      email                    escalator               
 [52] fee                      fence_type               foot                    
 [55] footway                  frequency                height                  
 [58] highway                  highway_1                horse                   
 [61] hotel                    indoor                   internet_access         
 [64] is_in                    junction                 landuse                 
 [67] lanes                    layer                    leisure                 
 [70] level                    lit                      loc_name                
 [73] man_made                 maxspeed                 maxspeed:hgv            
 [76] maxstay                  mooring                  motor_vehicle           
 [79] name                     name:ar                  name:en                 
 [82] name:et                  name:he                  name:hu                 
 [85] name:ko                  name:loc                 name:ru                 
 [88] name:sl                  name:uk                  name:zh                 
 [91] natural                  note                     office                  
 [94] old_name                 old_name:hu              oneway                  
 [97] opening_hours            operator                 park_ride               
[100] parking                  phone                    place                   
[103] power                    public_transport:version railway                 
[106] ref                      religion                 roof:material           
[109] roof:shape               room                     rooms                   
[112] service                  shop                     sloped_curb             
[115] smoking                  source                   sport                   
[118] stars                    start_date               surface                 
[121] tactile_paving           toll                     tourism                 
[124] tracktype                tunnel                   turn:lanes              
[127] voltage                  water                    waterway                
[130] website                  wheelchair               wheelchair:description  
[133] wikidata                 wikipedia                wires                   
135 Levels: access access:note addr:city addr:country ... wires
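The listings above show which keys occur but not how often. Counting occurrences is a natural next step. The sketch below uses a small made-up tags data frame with the same `id`, `k`, `v` columns as `bdubai$nodes$tags`, so the values are purely illustrative.

```r
# Made-up tag table with the same columns as bdubai$nodes$tags
toy_tags <- data.frame(
  id = c(1, 1, 2, 3, 3, 4),
  k = c("highway", "name", "highway", "amenity", "name", "highway"),
  v = c("traffic_signals", "A", "bus_stop", "cafe", "B", "crossing"),
  stringsAsFactors = TRUE
)

# Frequency of each key, most common first
key_counts <- sort(table(toy_tags$k), decreasing = TRUE)
head(key_counts, 3)
```

Replacing `toy_tags` with `bdubai$nodes$tags` or `bdubai$ways$tags` would rank the real keys in the same way.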

users <- unique(bdubai$nodes$attrs$user)
print(users)

  [1] FresRe                   Skywave                  Tommy                   
  [4] hno2                     Cali42                   bigbug21                
  [7] mkarau                   DerCut                   msghmr                  
 [10] greecemapper             rehan727                 GeoGrafiker             
 [13] GRagib                   Veit                     Rudy355                 
 [16] ratrun                   sunmarke                 13 digits               
 [19] Tiramon                  Daniel Damianov          mx18                    
 [22] tiger_old                lorenzo23622             eXmajor                 
 [25] vamros                   Jennings Anderson        OSMF Redaction Account  
 [28] Binu Soman Punalur       Кирилл Игоревич          Sharat Sreedharan Nair  
 [31] Maxoo60                  csdf                     Calibrator              
 [34] robgeb                   Otti38                   mawueth                 
 [37] SEVEN                    DrJohnM                  WingedStone             
 [40] humbach                  Alex111X                 Hilton Worldwide        
 [43] Seandebasti              Golovco Anatolie         Luis36995               
 [46] RoadGeek_MD99            Artur Wierniewski        zaizone                 
 [49] RR81                     RichRico                 cat_crash               
 [52] keepright! ler           uludur                   davemcmahon             
 [55] andi9876                 ipp1963                  VMukhtarov              
 [58] ediyes                   Ben                      matata                  
 [61] jphilipz                 Waldrenner               Oberaffe                
 [64] Rose DiCarlo             Muokkaaja                marek kleciak           
 [67] Lloyd De Jongh           maps4androo              kp61                    
 [70] malcolmh                 Kush                     Philippe Jantzem        
 [73] Dave Stanley             Hooman Mesgary           bahrain_bob             
 [76] wheelmap_visitor         Martin Usi               knaim                   
 [79] Dimitri_Junker           Supercarwaar             Axelode                 
 [82] LordOfMaps               kerhac                   Geo1der                 
 [85] Tomas Straupis           Majid Khan Munjai        alwasam6                
 [88] Gijsrooy                 Joe Daniels              Ravu al Hemio           
 [91] ConsEbt                  Psarras                  Craig78                 
 [94] Serpens                  Jaber Mohammad           Maximo Arieu            
 [97] landfahrer               faisalshah               Carlos Pera             
[100] Reinerkl                 Hiren Modi               Glendensu               
[103] SNIPatrick               tony quartararo          ahmed abdo edries nasur 
[106] bitigchi                 Pavel Melnikov           dannykath               
[109] nshah                    ika-chan!                acltpe                  
[112] dalmacapital             wohnung058               Hypersteff              
[115] barberodubai             Faisaljaleel             08xavstj                
[118] PavelPS                  prend2424                sprocketonline          
[121] anasmahdidi              Ahamed Zulfan            Jov Elizaga             
[124] yahya                    CAMcLaren                lt00380                 
[127] kisaa                    12Katniss                Tunisiano Bird          
[130] Khurram Shehzad          Abdulaziz AlSweda        rdz3056                 
[133] ‫جمعان الزهراني‬‎           samely                   Dinesh Correa           
[136] calfarome                dumpster-diver           Nearo                   
[139] Niels Elgaard Larsen     highflyer74              JJJSSS                  
[142] Akos Vancza              Muteboy                  GSark                   
[145] Mark Ben Nasis           Hawler Hawler            Joseph Jude             
[148] Harinaam                 metesacker               SomeoneElse_Revert      
[151] Pleabargain              Arabian aussie           kiwiwelshie             
[154] EvanSiroky               Hany Samuel              yourfriendjames         
[157] Tresind Restaurant       amanza                   HomoJin                 
[160] Anjam Tellicher          Brighter prep            flep                    
[163] karitotp                 Shahid Khan111           Raven Martirez De Guzman
[166] moonraj                  Scott Shepherd           rajender raj            
[169] Julian Parushev          Patel Shashikant         Umidjon Khashimov       
[172] Dara Adam khel           Rocia                    Youngsoo Hugh Kim       
[175] Hankoo                   mapmeld                  nilesh ambre            
[178] Amjad Roushdy            QWE1234                  muh saqi                
[181] Ahmed Arafa40            Mowrad Rownak            Vishal Johari           
[184] Sandor Sigler            Asghar Shinwari          Ro Sun                  
[187] Anil Reddy Ponnapati     ‫نبيل الغسيني‬‎             Mohammed Yousef         
[190] Ammad Ul Islam           i0d                      Ganesh Mannar           
[193] Hussain Alyousuf         Vildan                   Andre68                 
[196] FilO                     richard worl             BCNorwich               
[199] JAY725                   Kurashige                GiJo                    
[202] SimoneScharl            
202 Levels: 08xavstj 12Katniss 13 digits Кирилл Игоревич ... ‫نبيل الغسيني‬‎

1.3 Plotting with osmar R

plot(bdubai)

tss <- find(sdubai, node(tags(v == "traffic_signals")))
ts_sdubai <- subset(sdubai, node_ids = tss)

bss <- find(sdubai, node(tags(v %agrep% "busstop")))
bs_sdubai <- subset(sdubai, node_ids = bss)

hws <- find(sdubai, way(tags(k == "highway")))
hws <- find_down(sdubai, way(hws))
hw_sdubai <- subset(sdubai, ids = hws)

tus <- find(sdubai, way(tags(k == "tunnel")))
tus <- find_down(sdubai, way(tus))
tu_sdubai <- subset(sdubai, ids = tus)

plot_ways(hw_sdubai, col = "steelblue")
plot_ways(tu_sdubai, add = TRUE, col = "magenta")
plot_nodes(ts_sdubai, add = TRUE, col = "red")
plot_nodes(bs_sdubai, add = TRUE, col = "blue")

ts <- find(bdubai, node(tags(v == "traffic_signals")))
ts_dubai <- subset(bdubai, node_ids = ts)

bs <- find(bdubai, node(tags(v %agrep% "busstop")))
bs_dubai <- subset(bdubai, node_ids = bs)

hw <- find(bdubai, way(tags(k == "highway")))
hw <- find_down(bdubai, way(hw))
hw_dubai <- subset(bdubai, ids = hw)

tu <- find(bdubai, way(tags(k == "tunnel")))
tu <- find_down(bdubai, way(tu))
tu_dubai <- subset(bdubai, ids = tu)

plot_ways(hw_dubai, col = "steelblue")
plot_ways(tu_dubai, add = TRUE, col = "magenta")
plot_nodes(ts_dubai, add = TRUE, col = "red")
plot_nodes(bs_dubai, add = TRUE, col = "blue")

brewer.pal.info["Set3",]$maxcolors

bg <- find(bdubai, way(tags(k == "building")))
bg <- find_down(bdubai, way(bg))
bg_dubai <- subset(bdubai, ids = bg)
bg_poly <- as_sp(bg_dubai, "polygons")

spplot(bg_poly, zcol = "version", col.regions = brewer.pal(12, "Set3"))

# bus <- find(bdubai, relation(tags(v == "bus")))
# bus_dubai <- lapply(bus, function(i) { as_sp(get_osm(relation(i), full = TRUE), "lines") })

bs_points <- as_sp(bs_dubai, "points")

hw_line <- as_sp(hw_dubai, "lines")

plot(bg_poly, col = "lightsteelblue")
plot(hw_line, add = TRUE, col = "blue")
plot(bs_points, add = TRUE, col = "red")
# for (i in seq(along = bus_dubai)) { plot(bus_dubai[[i]], add = TRUE, col = "blue") }

1.4. Extract from OpenStreetMap.org.

Another option is to download ready-made extracts in several formats from the website https://mapzen.com/data/metro-extracts/metro/dubai_abu-dhabi/ . The files dubai_abu-dhabi.osm, dubai_abu-dhabi_buildings.geojson, etc. were downloaded. The contents of the .osm file were then converted to CSV and JSON with purpose-written Python functions.
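The Python conversion functions are not shown in this notebook. For illustration only, the node part of such a conversion could look like this in R with the XML package, which osmar already depends on. The inline XML fragment is made up, the output goes to a temporary file, and the column selection is an assumption rather than the schema used by the original scripts.

```r
library(XML)

# Made-up OSM XML fragment with two nodes
osm_text <- '<osm version="0.6">
  <node id="1" lat="25.2048" lon="55.2708" user="alice" uid="10">
    <tag k="highway" v="traffic_signals"/>
  </node>
  <node id="2" lat="25.1972" lon="55.2744" user="bob" uid="20"/>
</osm>'

doc <- xmlParse(osm_text, asText = TRUE)
node_set <- getNodeSet(doc, "//node")

# Pull one attribute from every <node> element as a character vector
attr_of <- function(name) {
  vapply(node_set, function(n) xmlGetAttr(n, name), character(1))
}

nodes_df <- data.frame(
  id = attr_of("id"),
  lat = as.numeric(attr_of("lat")),
  lon = as.numeric(attr_of("lon")),
  user = attr_of("user"),
  uid = attr_of("uid"),
  stringsAsFactors = FALSE
)

# Write the node table to a temporary CSV file and read it back
csv_path <- tempfile(fileext = ".csv")
write.csv(nodes_df, csv_path, row.names = FALSE)
read.csv(csv_path)
```

Parsing the whole document at once like this is only practical for small inputs. A file the size of the full extract would need a streaming approach.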

Sizes of the downloaded OSM file and of the derived JSON and CSV files:

OSM & JSON

file.size("/Users/olgabelitskaya/large-repo/dubai_abu-dhabi.osm")

file.size("/Users/olgabelitskaya/large-repo/dubai_abu-dhabi.osm.json")

CSV

file.size("/Users/olgabelitskaya/large-repo/nodes.csv")

file.size("/Users/olgabelitskaya/large-repo/nodes_tags.csv")

file.size("/Users/olgabelitskaya/large-repo/ways.csv")

file.size("/Users/olgabelitskaya/large-repo/ways_tags.csv")

file.size("/Users/olgabelitskaya/large-repo/ways_nodes.csv")
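`file.size()` reports bytes, which is hard to read for files of this size. A small wrapper can collect the sizes into one table in megabytes. The helper name `size_table` is hypothetical, and the demonstration uses temporary files rather than the real extracts.

```r
# Build a table of file names and sizes in megabytes (hypothetical helper).
# file.size() is vectorised and returns NA for files that do not exist.
size_table <- function(paths) {
  data.frame(
    file = basename(paths),
    size_mb = round(file.size(paths) / 1024^2, 2),
    stringsAsFactors = FALSE
  )
}

# Temporary stand-in files so the example runs on its own
demo_dir <- tempfile("sizes_")
dir.create(demo_dir)
demo_files <- file.path(demo_dir, c("nodes.csv", "ways.csv"))
writeLines(rep("x", 1000), demo_files[1])
writeLines(rep("y", 10), demo_files[2])

sizes <- size_table(demo_files)
sizes
```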

1.5 Osmar sources

source1 <- osmsource_file("dubai_abu-dhabi.osm")

# dubai1 <- get_osm(complete_file(), source=source1)

dubai2 <- osmar:::get_osm_data.osmfile(source1)

dubai2[5]

as_osmar(xmlParse(dubai2[5]))$nodes$attrs

get_osm(node(21133779), source = osmsource_api())$nodes$attrs

1.6 Osmplotr

dad_box <- get_bbox(c(55.2408, 25.1548, 55.2808, 25.2148))

dad_buildings <- extract_osm_objects(key='building', bbox=dad_box)

dad_buildings

class       : SpatialPolygonsDataFrame 
features    : 5 
extent      : 55.26595, 55.27728, 25.19427, 25.20051  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs +towgs84=0,0,0 
variables   : 7
names       :        id, visible, timestamp, version, changeset, user, uid 
min values  :  65460387,      NA,        NA,      NA,        NA,   NA,  NA 
max values  : 126601137,      NA,        NA,      NA,        NA,   NA,  NA

dad_highways <- extract_osm_objects(key='highway', bbox=dad_box)

dad_map <- osm_basemap(bbox = dad_box, bg = 'lightgrey')

dad_map <- add_osm_objects(dad_map, dad_buildings, col = 'darkblue')

dad_map <- add_osm_objects(dad_map, dad_highways, col = 'steelblue')

dad_map

2. CSV & SQL

2.1. From osmar to csv files

# write.csv(dubai1$nodes$attrs, file = "rnodes.csv")
# file.size("rnodes.csv")

# write.csv(dubai1$nodes$tags, file = "rnodes_tags.csv")
# file.size("rnodes_tags.csv")

# write.csv(dubai1$ways$attrs, file = "rways.csv")
# file.size("rways.csv")

# write.csv(dubai1$ways$tags, file = "rways_tags.csv")
# file.size("rways_tags.csv")

# write.csv(dubai1$ways$refs, file = "rways_refs.csv")
# file.size("rways_refs.csv")

# write.csv(dubai1$relations$attrs, file = "rrelations.csv")
# file.size("rrelations.csv")

# write.csv(dubai1$relations$tags, file = "rrelations_tags.csv")
# file.size("rrelations_tags.csv")

# write.csv(dubai1$relations$refs, file = "rrelations_refs.csv")
# file.size("rrelations_refs.csv")

2.2. From csv files to SQL

The following code writes the contents of the CSV files to an SQLite database.

Variant #1

sqlite <- dbDriver("SQLite")

dubai_abu_dhabi <- dbConnect(sqlite,"dubai_abu_dhabi.sqlite3")

nodes <- read.csv('nodes.csv')
nodes_tags <- read.csv('nodes_tags.csv')
ways <- read.csv('ways.csv')
ways_tags <- read.csv('ways_tags.csv')
ways_nodes <- read.csv('ways_nodes.csv')

# dbWriteTable(conn = dubai_abu_dhabi, name = 'nodes', value = nodes, row.names = FALSE)

# dbWriteTable(conn = dubai_abu_dhabi, name = 'nodes_tags', value = nodes_tags, row.names = FALSE)

# dbWriteTable(conn = dubai_abu_dhabi, name = 'ways', value = ways, row.names = FALSE)

# dbWriteTable(conn = dubai_abu_dhabi, name = 'ways_tags', value = ways_tags, row.names = FALSE)

# dbWriteTable(conn = dubai_abu_dhabi, name = 'ways_nodes', value = ways_nodes, row.names = FALSE)

dbListTables(dubai_abu_dhabi)

dbListFields(dubai_abu_dhabi, 'nodes')

Variant #2

# sqldf("attach dubai_abu_dhabi as new")

# read.csv.sql("nodes.csv", sql = "create table nodes as select * from file", dbname = "dubai_abu_dhabi")

sqldf("select * from nodes limit 3", dbname = "dubai_abu_dhabi")

# read.csv.sql("nodes_tags.csv", sql = "create table nodes_tags as select * from file", dbname = "dubai_abu_dhabi")

sqldf("select * from nodes_tags limit 3", dbname = "dubai_abu_dhabi")

# read.csv.sql("ways.csv", sql = "create table ways as select * from file", dbname = "dubai_abu_dhabi")

sqldf("select * from ways limit 3", dbname = "dubai_abu_dhabi")

# read.csv.sql("ways_tags.csv", sql = "create table ways_tags as select * from file", dbname = "dubai_abu_dhabi")

sqldf("select * from ways_tags limit 3", dbname = "dubai_abu_dhabi")

# read.csv.sql("ways_nodes.csv", sql = "create table ways_nodes as select * from file", dbname = "dubai_abu_dhabi")

sqldf("select * from ways_nodes limit 3", dbname = "dubai_abu_dhabi")

2.3. SQL querying¶

query001 = "SELECT COUNT(*) FROM nodes;"
query002 = "SELECT COUNT(*) FROM ways;"

The number of nodes:

sqldf(query001)

The number of ways:

sqldf(query002)

The number of users:

print(sqldf("SELECT COUNT(DISTINCT(e.uid)) FROM (SELECT uid FROM nodes UNION ALL SELECT uid FROM ways) e;"))

  COUNT(DISTINCT(e.uid))
1                   1885

The database makes it possible to evaluate each user's contribution to editing the map.

Let us list the 3 most active editors of this map section:

formattable(sqldf("SELECT e.user, COUNT(*) as num \
             FROM (SELECT user FROM nodes UNION ALL SELECT user FROM ways) e \
             GROUP BY e.user \
             ORDER BY num DESC \
             LIMIT 3;"))
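To put these contributions in proportion, the counts can be related to the total number of nodes and ways. The following is a minimal sketch in base R, not part of the original notebook: it hard-codes the three counts shown in the query output and the node and way totals listed in section 5.2, and the variable names are illustrative.

```r
# Counts for the three most active editors, copied from the query output
top_editors <- data.frame(
  user = c("eXmajor", "chachafish", "Seandebasti"),
  num  = c(492808, 156874, 125767)
)

# Total number of elements: nodes + ways (figures from section 5.2)
total_elements <- 1890178 + 234327

# Share of all nodes and ways contributed by each editor, in percent
top_editors$share_pct <- round(100 * top_editors$num / total_elements, 2)

print(top_editors)
```

On the full database the same ratio could be computed directly from the query results instead of hard-coded numbers.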

A list of the 3 most common types of places:

formattable(sqldf("SELECT value, COUNT(*) as num \
            FROM nodes_tags \
            WHERE key='place' \
            GROUP BY value \
            ORDER BY num DESC \
            LIMIT 3;"))

A list of the 10 most common types of buildings:

formattable(sqldf("SELECT value, COUNT(*) as num \
            FROM nodes_tags \
            WHERE key='building' \
            GROUP BY value \
            ORDER BY num DESC \
            LIMIT 10;"))

A list of the 10 most common facilities:

formattable(sqldf("SELECT value, COUNT(*) as num \
            FROM nodes_tags \
            WHERE key='amenity' \
            GROUP BY value \
            ORDER BY num DESC \
            LIMIT 10;"))

A list of the 20 most common streets:

formattable(sqldf("SELECT value, COUNT(*) as num \
            FROM nodes_tags \
            WHERE key='street' \
            GROUP BY value \
            ORDER BY num DESC \
            LIMIT 20;"))

# dbDisconnect(dubai_abu_dhabi)

3. JSON & MongoDB¶

With very similar manipulations we can import the data from JSON files into MongoDB.

# Run mongod from terminal

Let's explore the dataset with the 'mongolite' package.

Variant #1

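# mongoDbConnect() belongs to RMongo, which is not installed (see 0.1), and mg1 is not used below
# mg1 <- mongoDbConnect('test')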

m1 <- mongo("openstreetmap_correct", verbose = FALSE)

stream_in(file("/Users/olgabelitskaya/large-repo/dubai_abu-dhabi_postcode.osm.json"), 
          handler = function(df){m1$insert(df)})

using a custom handler function.
opening file input connection.

 Found 2124505 records...

closing file input connection.

m1$count()

Variant #2

m <- mongo("openstreetmap", verbose = FALSE)

# stream_in(file("dubai_abu-dhabi.osm.json"), handler = function(df){m$insert(df)})

The number of documents:

m$count()

The three most active editors of this map section:

m$aggregate('[
    { "$group" : {"_id" : "$created.user", "count" : { "$sum" : 1} } }, 
    { "$sort" : {"count" : -1} }, { "$limit" : 3 } 
]')

The number of users with only one document and a list of 10 such users:

m$aggregate('[
    { "$group" : {"_id" : "$created.user", "count" : { "$sum" : 1} } },
    { "$group" : {"_id" : "$count", "num_users": { "$sum" : 1} } },
    { "$sort" : {"_id" : 1} }, { "$limit" : 1} 
]')

m$aggregate('[
    { "$group" : {"_id" : "$created.user", "count" : { "$sum" : 1} } }, 
    { "$sort" : {"count" : 1} }, { "$limit" : 10 } 
]')

The list of 3 most common places:

m$aggregate('[
    { "$match" : { "address.place" : { "$exists" : 1} } }, 
    { "$group" : { "_id" : "$address.place", "count" : { "$sum" : 1} } },  
    { "$sort" : { "count" : -1}}, {"$limit":3}
]')

The list of 10 most common types of buildings:

m$aggregate('[
    { "$match": { "building": { "$exists": 1}}}, 
    { "$group": { "_id": "$building", "count": { "$sum": 1}}}, 
    { "$sort": { "count": -1}}, {"$limit": 10}
]')

The list of 10 most common facilities:

m$aggregate('[
    { "$match": { "amenity": { "$exists": 1}}}, 
    { "$group": { "_id": "$amenity", "count": { "$sum": 1}}},
    { "$sort": { "count": -1}}, { "$limit": 10}
]')

The list of 3 most common zipcodes:

m$aggregate('[ 
    { "$match" : { "address.postcode" : { "$exists" : 1} } }, 
    { "$group" : { "_id" : "$address.postcode", "count" : { "$sum" : 1} } },  
    { "$sort" : { "count" : -1}}, {"$limit": 3}
]')

Counting zipcodes with one document:

m$aggregate(' [ 
    { "$group" : {"_id" : "$address.postcode", "count" : { "$sum" : 1} } },
    { "$group" : {"_id" : "$count", "count": { "$sum" : 1} } },
    { "$sort" : {"_id" : 1} }, { "$limit" : 1} 
]')

Some examples of statistical indicators for this collection:

m$info()$stats$ns

m$info()$stats$size

m$info()$stats$avgObjSize

m$info()$stats$storageSize

4. Problems and errors¶

4.1¶

One of the main problems of public maps is that place names are not consistently duplicated in other languages. Automating translation by building up a shared database of map names in many languages would save users from many difficulties and mistakes.

4.2¶

The next problem is the existence of many separate databases (including mapping ones) that describe the same map objects. Procedures for integrating the data that are already available would relieve many people of unnecessary work and save time and effort.

4.3¶

The information about the number of buildings and their purpose is clearly incomplete. The completeness of public maps can be improved by bringing new users into the mapping process. To that end, entering information should be as simple as possible: for example, choosing from the available options, with many fields filled in automatically for linked options (such as linking a street name to the administrative area in which it is located).

4.4¶

As in any public dataset, there are a number of mistakes and typos. Well-known methods can be proposed for correcting them: automatic comparison with existing data and verification of new data by other users.
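One simple form of automatic comparison is to check the last word of a street name against a list of known abbreviations. The sketch below is a hedged illustration in base R and not the procedure used in this project: the mapping `street_type_map` and the function `normalize_street` are hypothetical, and the sample names are taken from the street list in this notebook.

```r
# Hypothetical mapping from abbreviated street types to their full forms
street_type_map <- c("St." = "Street", "St" = "Street",
                     "Rd." = "Road",   "Rd" = "Road")

# Replace an abbreviated last word of a street name with its full form
normalize_street <- function(name) {
  if (is.na(name)) return(NA_character_)
  words <- strsplit(trimws(name), "\\s+")[[1]]
  if (length(words) == 0) return(name)
  last <- words[length(words)]
  if (last %in% names(street_type_map)) {
    words[length(words)] <- street_type_map[[last]]
  }
  paste(words, collapse = " ")
}

# Street names as they appear in the query output of this notebook
streets <- c("King Abdul Aziz St.", "Hamdan Street", "Sheikh Zayed Road")
cleaned <- vapply(streets, normalize_street, character(1), USE.NAMES = FALSE)
print(cleaned)
```

Names that do not end in a known abbreviation are returned unchanged, so the function can be applied to the whole column without affecting correct values.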

4.5¶

The lack of a uniform postal code system in this particular dataset complicates the identification and verification of postal codes.
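A first step towards verification could be to group the postcode values by their format. The following base R sketch is an assumption about how such a check might look. The function `describe_postcode` is hypothetical, the three numeric values come from the aggregation output in this notebook, and the non-numeric value is invented purely to exercise the check.

```r
# Three values from the notebook output plus one made-up non-numeric value
postcodes <- c("811", "473828", "24857", "P.O. Box 5555")

# Label each value by its format: number of digits, or non-numeric
describe_postcode <- function(code) {
  code <- trimws(code)
  if (grepl("^[0-9]+$", code)) {
    paste0("digits_", nchar(code))
  } else {
    "non_numeric"
  }
}

formats <- vapply(postcodes, describe_postcode, character(1), USE.NAMES = FALSE)

# Frequency of each format in the sample
print(table(formats))
```

A wide spread of formats in such a table is one way to make the lack of uniformity visible before deciding how to clean the values.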

4.6¶

While working on the project, I spent a lot of time converting data files from one format to another. Each format has its own advantages and disadvantages. It may be possible to design a universal file type that can store data of any kind, combines the advantages of the existing types, and can be used from most existing programming languages.

4.7¶

It seems appropriate to me to correct errors in the data after uploading the files to the database. Sometimes a record that looks like a mistake for a particular field simply contains additional information about geographic objects.

5. Data Overview¶

5.1 Description of the data structure:¶

1) nodes - points in space with basic characteristics (lat, long, id, tags);

2) ways - defining linear features and area boundaries (an ordered list of nodes);

3) relations - tags plus an ordered list of members (nodes, ways and/or other relations), used to define logical or geographic relationships between elements.
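The link between ways and nodes can be illustrated with the ways_nodes table, where each row gives a way id, a node id and the node's position within the way. The sketch below is a minimal base R example: the three rows are copied from the ways_nodes sample in this notebook but deliberately shuffled, and the helper `way_node_ids` is hypothetical.

```r
# Rows from the ways_nodes sample in this notebook, in shuffled order
ways_nodes_sample <- data.frame(
  id       = c(4009554, 4009554, 4009554),
  node_id  = c(21133804, 90031463, 90028252),
  position = c(2, 0, 1)
)

# Return the node ids of one way in the order given by the position column
way_node_ids <- function(ways_nodes, way_id) {
  rows <- ways_nodes[ways_nodes$id == way_id, ]
  rows$node_id[order(rows$position)]
}

ordered_ids <- way_node_ids(ways_nodes_sample, 4009554)
print(ordered_ids)
```

Because the order of the nodes defines the geometry of a way, the position column has to be respected whenever a way is rebuilt from the table.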

5.2 Indicators.¶

1) Size of the .osm file: 394.4 MB.

2) Size of the .osm sample file: 3.9 MB.

3) Nodes: 1890178.

4) Ways: 234327.

5) Relations: 2820.

6) Tags: 503027.

7) Users: 1895.

5.3 SQL & MongoDB¶

A small set of commands is enough to produce a statistical description of the collections and the databases: for example, m$info() in section 3 for MongoDB and COUNT queries in section 2.3 for SQL.
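For the SQL side, such a description can be as simple as a row count per table. The sketch below is a hedged illustration that uses the DBI and RSQLite packages with an in-memory database. The two tiny tables are made up for the example and are not the project's data, and in the project the connection would point to dubai_abu_dhabi.sqlite3 instead.

```r
library(DBI)
library(RSQLite)

# In-memory database with two tiny, made-up tables (illustrative only)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "nodes", data.frame(id = 1:5, uid = c(1, 1, 2, 3, 3)))
dbWriteTable(con, "ways",  data.frame(id = 1:2, uid = c(1, 2)))

# Row count for every table in the database
row_counts <- vapply(dbListTables(con), function(tbl) {
  sql <- paste0("SELECT COUNT(*) AS n FROM ", dbQuoteIdentifier(con, tbl))
  as.numeric(dbGetQuery(con, sql)$n)
}, numeric(1))

print(row_counts)

dbDisconnect(con)
```

The same loop works unchanged on a file-based SQLite database, since it only relies on `dbListTables()` and `dbGetQuery()`.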

6. Conclusion¶

This project has been educational for me. I believe one of its main tasks was to study methods for extracting and exploring openly available map data. For example, I used a systematic sample of elements from the original .osm file to try out the processing functions before applying them to the whole dataset. As a result, I have gained useful new skills in parsing, processing, storing, aggregating and applying data.

During the research I read through quite a lot of other students' projects on this topic. After my own research and a review of other authors' results, I have formed a definite opinion about the ideas behind OpenStreetMap.

This website can be viewed as a testing ground for the interaction of a large number of people (including non-professionals) in creating a unified information space. The prospects of such cooperation cannot be overemphasized. The success of the project would make it possible to implement ambitious plans in accessible information technologies, virtual reality and many other areas.

An increase in the number of users has many positive effects in this kind of project:

1) rapid improvement in the accuracy, completeness and timeliness of information;

2) bringing the information space closer to reality and making data evaluation more objective;

3) less effort spent on cleansing the data of erroneous details.

The ideas for improving the OpenStreetMap project are simple and natural.

The number of users can be increased with additional options such as rating marks (e.g., the best restaurant or the most convenient parking).

The project could also become more popular through temporary pop-up messages from users (displayed for no more than 1-3 hours) with current information about a geographic location (e.g., the presence of traffic jams).

m	km	miles	seconds	minutes	hours	startLon	startLat	endLon	endLat	leg	route
30	0.030	0.0186420	41	0.6833333	0.011388889	55.27443	25.19762	55.27464	25.19781	1	A
79	0.079	0.0490906	9	0.1500000	0.002500000	55.27464	25.19781	55.27492	25.19818	2	A
142	0.142	0.0882388	20	0.3333333	0.005555556	55.27492	25.19818	55.27416	25.19921	3	A
371	0.371	0.2305394	71	1.1833333	0.019722222	55.27416	25.19921	55.27741	25.20055	4	A
1281	1.281	0.7960134	134	2.2333333	0.037222222	55.27741	25.20055	55.28841	25.19535	5	A
763	0.763	0.4741282	86	1.4333333	0.023888889	55.28841	25.19535	55.28298	25.19152	6	A

id	lat	lon	user	uid	version	changeset	timestamp
21133776	25.17528	55.39664	Tommy	18885	3	7291467	2011-02-15T02:24:49Z
21133779	25.14804	55.38621	Tommy	18885	2	7291467	2011-02-15T02:24:42Z
21133785	25.19667	55.30909	Tommy	18885	9	12645525	2012-08-07T13:29:47Z

id	node_id	position
4009554	90031463	0
4009554	90028252	1
4009554	21133804	2

value	num
village	608
locality	507
suburb	144

value	num
restaurant	1310
parking	596
fast_food	427
cafe	392
place_of_worship	362
pharmacy	310
bank	290
fuel	274
atm	216
bench	215

id	key	value	type
21136186	crossing	island	regular
21136186	highway	traffic_signals	regular
21161907	operator	Eppco	regular

id	user	uid	version	changeset	timestamp
4009554	rehan727	2952340	25	42505170	2016-09-28T21:02:31Z
4334711	4b696d	1420318	21	28096059	2015-01-12T19:49:12Z
4340534	wk2	1808544	18	18943947	2013-11-16T22:28:53Z

id	key	value	type
4009554	bridge	yes	regular
4009554	lanes	no|no|no|no|yes|yes	hgv
4009554	highway	motorway	regular

value	num
yes	78
mosque	50
hut	41
residential	37
apartments	14
commercial	7
entrance	7
university	7
public	6
industrial	5

value	num
Al Taawun Street	48
Sheikh Zayed Road	23
Sheikh Mohammed bin Zayed Road	21
Al Ramth	20
Al Ettihad Road	14
King Faisal Street	13
Yas Leisure Drive	11
Paragon Mall, Reem Island	10
Al Fahidi (19th) Street	9
Sheikh Zayed The First Street	9
Corniche Road West	8
Hamdan Street	8
Yas Mall	8
King Abdul Aziz St.	7
10 B Street	6
Al Meena Street	6
Al Raffa Street	6
Al Thammam	6
Hazaa Bin Zayed The First Street	6
Jumeirah Beach Road	6

_id	count
ganesh reddy	1
Msmsms99	1
Rjensky	1
thajudeen n	1
aceman444	1
Haseeb1973	1
IňnôÇëňt Rähúl Päl	1
Khadar Mohaideen	1
Niyas Badarudeen	1
Emma Danny	1

_id	count
Yas Mall	14
Jumeirah Village Triangle	10
Deerfields Townsquare Shopping Centre	2

_id	count
yes	43834
house	4216
apartments	2910
residential	2606
roof	1026
hangar	825
warehouse	380
mosque	378
garage	314
commercial	313

_id	count
parking	5602
place_of_worship	1443
restaurant	1372
school	489
fast_food	442
fuel	438
cafe	403
bank	317
pharmacy	311
shelter	247

user	num
eXmajor	492808
chachafish	156874
Seandebasti	125767

_id	count
eXmajor	492808
chachafish	156874
Seandebasti	125767

_id	count
811	5
473828	4
24857	3