Rumi DataHub | pykih

Rumi DataHub

Rumi.io is a GarageBand for data: managers, journalists and open-data enthusiasts can collaboratively make sense of data themselves, without developers.

We do this by providing drag-and-drop tools so that non-developers can pull, clean, model, mix, analyse and visualise any kind of data themselves.

Getting started

You can upload any two-dimensional, grid-like data set. Either create an empty spreadsheet or upload a CSV (comma-separated values) file of up to 15 MB. For larger data sets, use the “Upload via WEB Address (URL)” option.

At the heart of Rumi is a Spreadsheet view.

When you first upload data, it opens in the spreadsheet view. In the right sidebar, Rumi automatically sorts the columns into two categories: dimensions and metrics. Dimensions are properties or aspects of the data, usually used to classify and group it. Metrics are parameters that can be measured and quantified. If our algorithm wrongly identifies a column as a metric or a dimension, you can drag the column from one category to the other.
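
The sketch below illustrates one way such a column split could work. It is not Rumi's actual algorithm: the rule (a column is a metric when every non-empty value parses as a number) and the names `is_number` and `classify_columns` are assumptions made for illustration.

```python
from typing import Dict, List


def is_number(value: str) -> bool:
    """Return True if the text can be parsed as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def classify_columns(rows: List[Dict[str, str]]) -> Dict[str, str]:
    """Label a column 'metric' if all its non-empty values are numeric,
    otherwise 'dimension'. Hypothetical heuristic, not Rumi's own."""
    labels: Dict[str, str] = {}
    if not rows:
        return labels
    for column in rows[0].keys():
        values = [row[column] for row in rows if row[column].strip() != ""]
        all_numeric = bool(values) and all(is_number(v) for v in values)
        labels[column] = "metric" if all_numeric else "dimension"
    return labels


# Illustrative rows, as they might be read from an uploaded CSV.
rows = [
    {"city": "A", "year": "2011", "sales": "120.5"},
    {"city": "B", "year": "2012", "sales": "98"},
    {"city": "C", "year": "2012", "sales": ""},
]

labels = classify_columns(rows)
print(labels)
```

With this heuristic the `year` column is labelled a metric because its values are numeric, even though a year is normally used to group data. That is the kind of misclassification the drag-and-drop correction is meant for.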

You can mark a data set as a dictionary with the ‘Mark it as dictionary’ option, so that you can easily reuse it later to clean other data sets. For example, if a data set contains geographic locations with iso2 codes, it can be marked as a dictionary; then, if a subsequent data set contains the same geographic locations, the iso2 codes from the dictionary can be matched to those locations.
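
As a rough illustration of the dictionary idea, the sketch below matches location names against a small lookup table and attaches an iso2 code to each row. The function name `apply_dictionary`, the column names and the matching rule (trimmed, case-insensitive) are assumptions, not Rumi's implementation.

```python
from typing import Any, Dict, List


def apply_dictionary(
    rows: List[Dict[str, Any]],
    dictionary: Dict[str, str],
    key_column: str,
    new_column: str,
) -> List[Dict[str, Any]]:
    """Return copies of the rows with a code looked up from the dictionary.
    Rows whose key is not in the dictionary get None."""
    cleaned = []
    for row in rows:
        key = str(row[key_column]).strip().lower()
        cleaned.append({**row, new_column: dictionary.get(key)})
    return cleaned


# A data set previously marked as a dictionary: location name -> iso2 code.
iso2_dictionary = {"india": "IN", "france": "FR", "japan": "JP"}

# A later data set that mentions the same locations, not yet standardised.
new_rows = [
    {"country": "India", "value": 10},
    {"country": " france ", "value": 7},
    {"country": "Atlantis", "value": 1},
]

result = apply_dictionary(new_rows, iso2_dictionary, "country", "iso2")
for row in result:
    print(row)
```

Unmatched rows keep a `None` code, so they can be reviewed by hand instead of being silently dropped.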

Filter data

To filter data, select a column. Numeric and temporal data points are presented as a histogram. Select a region on the histogram to filter the rows in the spreadsheet.
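
The histogram filter can be pictured as two steps: bin the column's values, then keep the rows whose value falls inside the selected region. The following is a minimal sketch with equal-width bins; the function names and the sample rows are hypothetical.

```python
from typing import Any, Dict, List


def histogram(values: List[float], bin_count: int) -> List[int]:
    """Count values in equal-width bins between the minimum and maximum."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bin_count or 1.0  # avoid a zero width if all values are equal
    counts = [0] * bin_count
    for v in values:
        index = min(int((v - lo) / width), bin_count - 1)  # the maximum goes in the last bin
        counts[index] += 1
    return counts


def filter_by_range(
    rows: List[Dict[str, Any]], column: str, low: float, high: float
) -> List[Dict[str, Any]]:
    """Keep the rows whose value in `column` lies within [low, high]."""
    return [row for row in rows if low <= row[column] <= high]


rows = [{"name": n, "age": a} for n, a in
        [("a", 21), ("b", 25), ("c", 34), ("d", 38), ("e", 47), ("f", 59)]]

counts = histogram([row["age"] for row in rows], bin_count=4)
selected = filter_by_range(rows, "age", 30, 50)
print(counts)
print([row["name"] for row in selected])
```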

Map view

Along with the spreadsheet, you can view your data on a map.

Filter rows in the table by selecting an area on the map. To do this, set the filter radius.

Finally, click the option to cluster the data, and Rumi combines nearby points on the map into clusters.
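
The radius filter in this section can be sketched as a distance check against a chosen centre point. The example below uses the haversine great-circle formula; that choice, the function names and the `lat`/`lon` column names are assumptions, since the page does not say how Rumi measures distance.

```python
import math
from typing import Any, Dict, List

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def filter_by_radius(
    rows: List[Dict[str, Any]], center_lat: float, center_lon: float, radius_km: float
) -> List[Dict[str, Any]]:
    """Keep the rows that lie within radius_km of the centre point."""
    return [
        row for row in rows
        if haversine_km(center_lat, center_lon, row["lat"], row["lon"]) <= radius_km
    ]


rows = [
    {"name": "Mumbai", "lat": 19.0760, "lon": 72.8777},
    {"name": "Pune", "lat": 18.5204, "lon": 73.8567},
    {"name": "Delhi", "lat": 28.6139, "lon": 77.2090},
]

# Select everything within 200 km of Mumbai.
nearby = filter_by_radius(rows, 19.0760, 72.8777, radius_km=200)
print([row["name"] for row in nearby])
```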

Clean your data

Collected data is often not standardised or clean, which hampers any sort of analysis or visualisation. Out of the box, Rumi provides many cleaning functions, e.g. Facets, Distributions and Filter by data types.

To clean a column, select the column.

Rumi currently lets you assign five data types to your data: Text, Number, Decimal, Boolean (True/False) and Date. When assigning a data type to a metric, choose the most appropriate one, since choosing the wrong data type can lose accuracy in the data. For example, changing the data type from Decimal to Number would make the data lose accuracy.

After assigning data types to all the metrics, you can also change metrics to dimensions and vice versa by dragging and dropping the green tiles in the right sidebar into the Dimension or Metric section.
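
The accuracy loss is easy to see with a small example. The sketch below assumes that Number means a whole number and that the conversion truncates the fractional part; the page confirms neither, so treat `to_number` as a hypothetical stand-in for whatever Rumi does internally.

```python
from decimal import Decimal
from typing import List


def to_number(value: str) -> int:
    """Convert decimal text to a whole number by dropping the fraction (assumed behaviour)."""
    return int(Decimal(value))


original: List[str] = ["12.75", "3.10", "8.99"]

as_decimals = [Decimal(v) for v in original]
as_numbers = [to_number(v) for v in original]

print(sum(as_decimals))  # total with the fractional parts kept
print(sum(as_numbers))   # total after converting each value to a whole number
```

Once the values are stored as whole numbers, converting them back to Decimal does not restore the lost fraction, so it is safer to pick the type before doing any analysis.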

We also offer the ability to assign metadata to your columns. Currently, we support latitude and longitude as metadata. You can do this with the ‘Assign Meta’ option under ‘Format’.

Visually query your data

Drag and drop to quickly create beautiful visualisations from a selection of 26+ charts and 109+ maps with 50+ features. Most importantly, when you update your data set, the change propagates to your visualisations within minutes. Each of these charts has been carefully designed. Finally, you can customise almost every aspect of a chart, including colour, typography, labels and axes.

At the bottom, you will notice that the dragged entity is automatically categorized as a dimension or a metric, and the ‘sum’ function is applied to it by default; you can remove the function by clicking the arrow that appears when you hover over it. Rumi offers a set of pre-built functions that you can apply to your metrics: sum, count, minimum, average and median.

Datacast suggests which charts can showcase your data set effectively and disables those whose requirements your data does not satisfy. After selecting the best chart, you can save the visualization to your account.
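
To make the pre-built functions concrete, the sketch below groups rows by a dimension and reduces a metric with one of the five functions, defaulting to sum. The `aggregate` function and the sample rows are illustrative assumptions, not Rumi's query engine.

```python
from collections import defaultdict
from statistics import mean, median
from typing import Any, Dict, List

# The five functions listed above, mapped to Python reducers.
AGGREGATES = {
    "sum": sum,
    "count": len,
    "minimum": min,
    "average": mean,
    "median": median,
}


def aggregate(
    rows: List[Dict[str, Any]], dimension: str, metric: str, function: str = "sum"
) -> Dict[Any, Any]:
    """Group rows by the dimension and reduce the metric with the chosen function."""
    groups: Dict[Any, List[float]] = defaultdict(list)
    for row in rows:
        groups[row[dimension]].append(row[metric])
    reducer = AGGREGATES[function]
    return {key: reducer(values) for key, values in groups.items()}


rows = [
    {"region": "North", "sales": 10},
    {"region": "North", "sales": 30},
    {"region": "North", "sales": 20},
    {"region": "South", "sales": 5},
    {"region": "South", "sales": 15},
]

print(aggregate(rows, "region", "sales"))            # default: sum per region
print(aggregate(rows, "region", "sales", "median"))  # median per region
```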

Out of the box, Rumi comes with 11 themes. With just a click, you can theme your charts. Alternatively, you can create custom themes for your organization account so that your charts blend well into your organization's look and feel.

To embed your visualisation in your blog, website or CMS, simply press the publish button. This publishes the visualisation together with its data and gives you HTML code that you can use within your CMS. If you need to publish the visualisation in print, such as a newspaper or magazine, you can export it in SVG format. Your team can then import it into Photoshop and print it. SVG is a vector file format, so you can print it at any resolution you want.

Open and available does not mean discoverable and usable

Open data portals today are like Yahoo’s directory listing (pre-Google era). One needs to visit multiple sources, download, analyse and often search again. If data is in PDFs, then it needs to be manually converted.

Charities, governments and open-data enthusiasts can create public projects and publish data sets in them. Anyone on the Internet can search for these data sets and “clone” them into their own accounts.

We are also creating a library of unmodified data from authentic open data portals (e.g. data.gov.in, RBI.org.in, mospi.nic.in, planningcommission.nic.in, unicef.org/statistics, indiabudget.nic.in, ncrb.nic.in, mha.nic.in and dise.in). You can simply search one site instead of searching across the Internet.

How can you use Rumi

Building block: Pykih is a services company. We love to custom design visualizations for our customers. Prior to Rumi, our projects were 3 floors tall, since we spent up to 60% of the budget on non-core things, e.g. back-ends, caching, hosting and querying data. Rumi empowers us to reliably deliver projects that are 9 floors tall. Furthermore, Rumi was designed to increase the lifetime value of a visualization by making it configurable.

Data as a service: Dashboards and visualizations are built on data. With Rumi, you can create a data set and export it to dashboards via APIs. Customers can log into Rumi, update their own data, press the PUBLISH button and three minutes later the visualization updates automatically. From a development standpoint, developers do not need to create tables, seed rows, build back-ends, cache reference tables, etc. Rumi handles all of this for you.

SQL-as-a-service: Data-as-a-service allows developers to build static dashboards. But what about “interactive” visualizations? Interactivity requires the ability to query data. Rumi comes with a JavaScript ORM called PykQuery. Using PykQuery, your JavaScript code can query data in the browser or query the database directly from the front-end, with minimal knowledge required.

Configuration Editors: Often clients need various parts of the software to be configurable. This helps you keep your visualizations relevant to changing real-world scenarios. Rumi.io allows business users to dynamically define various configurable elements (e.g. page titles, meta tags and meta descriptions) with a simple drag-and-drop UI. Customers can then change configurations from Rumi themselves, publish them and three minutes later the visualization updates.

Single sign-on, reliable hosting for custom dashboards: Custom dashboards and visualizations are front-end-only services projects. They are hosted either a) publicly for everyone to consume, e.g. for marketing or media, or b) internally for business intelligence. Connect your Git URL to Rumi and Rumi hosts it for you. If the dashboard is hosted internally, then the launched dashboard runs within the session and security of Rumi.