
Adding values in two columns to new column on sync table in CARTO?



We have field data coming in from an app platform and we want to drive a CartoDB visualisation to help with planning.

A bunch of rangers go out surveying sites to count crocodiles. For each site, the app collects:

  • Number of Male crocodiles (num_croc_male)
  • Number of Female crocodiles (num_croc_female)

We don't actually want our people standing there in the dark doing sums, so they upload the data in this format straight to the cloud, and this is then POSTed over to our CartoDB.

In the CartoDB visualisation, we want to show the two cols above, and we also want: Total Number of Crocodiles (num_croc_sitetotal).

Since this integer will also be used for styling the display of each site, I want the value stored in a column.

And since the data is being refreshed after each patrol, I need this to be a stable transform that persists through new rows being added.

I can't seem to find a tutorial or walkthrough for how this might be done… and my SQL is nearly non-existent. I was sort of hoping CartoDB had some sort of formula-building interface that would shield me from the full blast of SQL syntax.

Could anyone provide an outline of the solution steps I should be following here?


You can do that summation in the SQL statement behind the table in CartoDB. When you are in the Data View for the table, click on the SQL sidebar and you will see something like:

SELECT * FROM crocs_table

To include your summation, you would modify that statement to:

SELECT *, (num_croc_male+num_croc_female) as total_crocs FROM crocs_table

You need to click the "Apply query" button and then the table will refresh with the calculated column showing up. When you're in the map view, the calculated field is available for the purposes you describe.
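The query above computes the total on the fly. If you really do need it stored in a physical column that survives new rows arriving from the sync (the num_croc_sitetotal column the question asks for), one option is a trigger; CartoDB runs on PostgreSQL, so a minimal sketch (the function and trigger names are illustrative) would be:

    ALTER TABLE crocs_table ADD COLUMN num_croc_sitetotal integer;

    CREATE OR REPLACE FUNCTION set_croc_sitetotal()
    RETURNS trigger AS $$
    BEGIN
      -- keep the stored total in step with the two counts on every insert/update
      NEW.num_croc_sitetotal := COALESCE(NEW.num_croc_male, 0) + COALESCE(NEW.num_croc_female, 0);
      RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER croc_sitetotal_trg
    BEFORE INSERT OR UPDATE ON crocs_table
    FOR EACH ROW EXECUTE PROCEDURE set_croc_sitetotal();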


If you had the three columns in your CartoDB table, you could update it with the SQL API.

So if you wanted your CartoDB table (we'll call it crocs_table) to look like this:

    num_croc_male   num_croc_female   total_crocs
    5               10                15

You could update it using the SQL API like so:

http://{account}.cartodb.com/api/v2/sql?q={SQL statement}&api_key={Your API key}

Account = the name of your CartoDB account

Your API Key = The API key of your CartoDB account

SQL statement =

INSERT INTO crocs_table (num_croc_male, num_croc_female, total_crocs) VALUES (5, 10, 15);

The male crocs have good odds here… The total URL would be something like this:

http://{account}.cartodb.com/api/v2/sql?q=INSERT INTO crocs_table (num_croc_male, num_croc_female, total_crocs) VALUES (5, 10, 15)&api_key={Your API key}

Visiting the above URL would insert your data in CartoDB.

How the total (15) gets into the SQL statement depends on how you have your gator wranglers entering the data. If you're going to do it through a web form, you could use JavaScript to add the values together, build the URL, and then redirect the user to that URL.
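A sketch of that approach; the form field IDs are hypothetical, the placeholders are the same ones used above, and encodeURIComponent handles the URL-encoding that the raw URL above glosses over:

    // Read the two counts from a hypothetical web form, sum them,
    // and build the SQL API URL described above.
    var males = parseInt(document.getElementById('num_croc_male').value, 10) || 0;
    var females = parseInt(document.getElementById('num_croc_female').value, 10) || 0;
    var total = males + females;

    var sql = 'INSERT INTO crocs_table (num_croc_male, num_croc_female, total_crocs) ' +
              'VALUES (' + males + ', ' + females + ', ' + total + ')';

    var url = 'http://{account}.cartodb.com/api/v2/sql' +
              '?q=' + encodeURIComponent(sql) +
              '&api_key={Your API key}';

    window.location.href = url; // or send it with XMLHttpRequest instead of redirecting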


Don't use SUM(), which aggregates across rows; follow the syntax below:

SELECT (num_croc_male+num_croc_female) as total_crocs FROM crocs_table


Allowing editing only certain fields in SharePoint list

I have a SharePoint list where users submit data using a Microsoft form. After submission, I want managers to approve and comment on the submission before it goes to a higher level of management and a PDF is generated with the approval and comments. Is there a way to allow a manager to see the submission but change only certain fields, not the original submission?

P.S. if it’s not possible what would be the best way to do it?


Creating a slicer that filters multiple columns in Power BI

Power BI provides slicers for single columns, but there are scenarios where it could be useful to consolidate alternative filters for multiple columns into a single slicer. Technically, this is not possible in Power BI through the standard visualizations, but you can use a particular data modeling technique to obtain the desired result.

Consider the case of a Customer table with a geographical hierarchy with Continent, Country, and State. The requirement is to enable a filter over California (State), France (Country), and Asia (Continent) using a single slicer, as shown in the following screenshot.

The first step required in order to implement the solution is to create a calculated column in the Customer table that defines the granularity for the filter. Because the State value may be repeated, we create a calculated column in the Customer table by concatenating State and Country. The Continent column is not required because each Country belongs to only one Continent – in a different situation, consider concatenating all the columns that define a unique value for each row of the table:
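A minimal DAX sketch of such a calculated column, assuming State and Country columns in the Customer table:

    StateCountry = Customer[State] & ", " & Customer[Country]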

The StateCountry column is needed to create a relationship with the Slicer table that shows the possible choices in a single item. Such a table has a Slicer[Selection] column with at least one value for each item displayed in the slicer, and all the combinations of StateCountry values to filter for each possible selection:
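A hedged DAX sketch of such a Slicer calculated table, reusing the StateCountry column above; the three selections mirror the example, and the exact filter values are illustrative:

    Slicer =
    UNION (
        DISTINCT (
            SELECTCOLUMNS (
                FILTER ( Customer, Customer[State] = "California" ),
                "Selection", "California",
                "StateCountry", Customer[StateCountry]
            )
        ),
        DISTINCT (
            SELECTCOLUMNS (
                FILTER ( Customer, Customer[Country] = "France" ),
                "Selection", "France",
                "StateCountry", Customer[StateCountry]
            )
        ),
        DISTINCT (
            SELECTCOLUMNS (
                FILTER ( Customer, Customer[Continent] = "Asia" ),
                "Selection", "Asia",
                "StateCountry", Customer[StateCountry]
            )
        )
    )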

Once the Slicer table is created this way, there has to be a relationship between Slicer[StateCountry] and Customer[StateCountry]. This relationship has a many-to-many cardinality and a single filter direction, so that Slicer filters Customer (and not the other way around).

Now the model is ready. By selecting an item in the slicer, you apply a filter to the Slicer[Selection] column. This filter automatically propagates to the Customer[StateCountry] column thanks to the relationship created in the previous step. This is the model that supports the report shown at the beginning of this article.

Clearly, Slicer[Selection] could be used as a regular column in a visual. The following screenshot shows a matrix where the sales in Australia are included in two rows: Asia and Australia. Whenever you have a many-to-many cardinality relationship involved in a report, the measure is non-additive and the total shows a number that is not the sum of the visible rows – the total corresponds to the cumulative amount of the states selected considering each state only once.

This solution should work well for models where the cardinality of the relationship is of a few hundred unique values. If the Slicer[StateCountry] column has thousands of unique values or more, then it is better to consider an approach based on physical strong relationships relying on an intermediate table created with the unique values of Slicer[StateCountry]. The following technique is also available in Analysis Services versions that do not support the many-to-many cardinality in relationships. You can create the intermediate table StateCountry using the following calculated table definition:
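A minimal sketch of that intermediate calculated table, holding the unique values of Slicer[StateCountry]:

    StateCountry = DISTINCT ( Slicer[StateCountry] )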

By enabling the bidirectional filter between the Slicer and StateCountry tables – which should be hidden in report view – you get the same result as a many-to-many cardinality relationship, with improved performance.

The technique described in this article is useful whenever you want to show in a single column the values that are natively present in different columns of a table. This provides a better user experience and a simplified user interface of the report.


8.4 Part B Online mapping

8.4.1 Learning objectives

By the end of this practical you should be able to:

  1. Describe and explain different methods for producing online maps
  2. Create interactive maps using RPubs, RMarkdown site generator and Shiny
  3. Critically appraise the appropriateness of mapping techniques based on the dataset and purpose of the output map

8.4.2 Introduction

In this practical we are going to work through the preliminary stages of a mini-investigation. Since 2015 the law has capped short-term lets in London at 90 nights per year. However, this is very hard to enforce due to a lack of data and the extensive evidence required to prove that the limit has been exceeded. This was recently reflected in the Housing Research Note 2020/04: Short-term and holiday letting in London by the Greater London Authority (GLA):

“there are signs that short-term letting platforms are becoming increasingly commercialised and there are concerns that removing housing supply from the market to offer it for short-term letting could be exacerbating London’s housing shortage.”

The author, Georgie Cosh, was also kind enough to share some of the code used for this report. Guess what! They used R! Have a look at their code in the R file called GLA_airbnb_analysis in the prac8_data folder.

Whilst Airbnb have implemented a system that removes listings once they have been rented for 90 days per year (unless an appropriate permit is in place), we want to interactively visualise the number of Airbnb lettings (and hotels for comparison) per borough as a starting point. This could then be used to undertake further investigation into boroughs with high short-term lets, for example exploring other websites to see if the properties are listed there and jointly exceed 90 days, or optimising localised monitoring. As these rules only apply to entire homes, we will extract only these, and, for monitoring purposes (e.g. random annual inspections), only those available for 365 days of the year.

We will now explore several ways to do this…

The report by Cosh (2020) goes a bit further than this and implements an occupancy model (based on a number of assumptions) to estimate the number of nights an Airbnb is rented out for, so check it out; perhaps an idea for your final project.

8.4.3 RPubs

One of the most straightforward publishing tools is RPubs. It takes an .Rmd and directly uploads it to rpubs.com; all files are publicly available on this website.

To start with you need to make a free account. Go to https://rpubs.com/users/new and register.

Create a new project in RStudio and open a new R Markdown file (File > New File > R Markdown)

You'll see that the file is automatically populated with some information; have a read through it, then click the Knit icon…

Let's make some changes to your .Rmd. Delete all the text and code except the header information (that is enclosed by three dashes at the top of the file).

Insert a new code chunk (go back to RMarkdown if you need a refresher)…

Add some code of your choice from either a previous practical or your own work. As it's a new project you'll have to either copy the data into your project folder or set the working directory with setwd(). If it's all online data that you've loaded directly from the web into R, this shouldn't be an issue. I'm going to use the interactive map we made in practical 5 (the Advanced interactive map section). Here is the code I've put in my chunk:
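A rough sketch of the kind of chunk meant here, assuming the tmap package in view mode and a hypothetical Londonborough sf object with an airbnb_count column (standing in for the practical 5 code):

    ```{r}
    library(sf)
    library(tmap)

    # Hypothetical data: London borough polygons with a count of Airbnb lettings
    tmap_mode("view")   # switch tmap to interactive (leaflet-based) mode

    tm_shape(Londonborough) +
      tm_polygons("airbnb_count",
                  alpha = 0.5,
                  title = "Airbnb lettings per borough")
    ```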

  1. Add some text at the start of your .Rmd. You can include titles and subtitles using # followed by a space; a second-level subtitle would be ##, and a third ###.

Save the file and Knit it to HTML; this should be the default and is specified in the header (enclosed by the three dashes).

Once knitted you can easily publish the file to RPubs using the Publish icon, either in the viewer pane or the toolbar area (by Run).

Now how about adding a few design features… I've changed my header section to…
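A sketch of the sort of header tweaks meant here, using standard html_document options; the title, theme and floating table of contents are illustrative choices:

    ---
    title: "Airbnb in London"
    output:
      html_document:
        theme: yeti
        toc: true
        toc_float: true
    ---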

Knit and then publish again… you'll notice a few aesthetic changes.

To learn more about these go explore:

And for more code chunk control..

8.4.4 RMarkdown site generator

8.4.4.1 Set the file structure

RPubs is useful, but what if you wanted to make a full site with different tabs for introduction, methodology, results and recommendations… one way is to use the RMarkdown site generator hosted on GitHub.

The RMarkdown site generator is useful as it does not require any third-party add-ons like blogdown, which relies on the Hugo site generator.

To make a site you’ll need the following to be within your project:

  (a) A configuration file with the filename _site.yml

  (b) An index.Rmd file

  (c) Any other .Rmd files you want to create into pages on the site

For the site to work you only require (a) and (b)… but that would be a pretty boring site…
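A minimal sketch of what _site.yml might contain; the site name, page names and theme are illustrative:

    name: "mini-investigation"
    navbar:
      title: "Airbnb in London"
      left:
        - text: "Home"
          href: index.html
        - text: "Methodology"
          href: methodology.html
    output:
      html_document:
        theme: yeti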

8.4.4.2 Link to GitHub

There are two ways to do this….

8.4.4.2.1 GitHub first

This is the ‘easy’ way, as you would repeat the steps in Practical 4 by first making a new repository on GitHub, then loading a new project in RStudio, linking that to GitHub and copying all your files into your new project from the existing one.

8.4.4.2.2 GitHub last

So if you already have an RStudio project… like we do… we can link this to GitHub, but the steps are a bit more involved and there are several ways to achieve it, as with most things in R.

Make a Git repo in RStudio. Go to Tools > Project Options > Git/SVN, select Git under Version control system to initialize a new repository, then restart RStudio. The Git tab should appear.

Next we need to make a new repository on GitHub. Go to GitHub, log in and make a new repository. Make sure that it is empty, with no README. You should have something similar to this appear:

Make one local commit. Under the Git tab > Diff > Stage the files > Add a commit message and click commit

Now we need to connect our local repository to the GitHub one. So under the Git tab you'll see the new branch button (two purple boxes linked to a white box)…

  1. Click it > Add Remote. Paste in the URL, use the remote name origin and the branch name master (which you can get from the GitHub Quick setup screen after creating your repo). Check "Sync the branch with the remote" > click Create, then select overwrite.

Push the files to your GitHub and they will appear on your GitHub repo

Next we need to actually build the site… there are a few ways to do this… Go to the Git tab and you should see the Build tab next to it; if you can't, go to Tools > Project Options > Build Tools and select Website under Project build tools. Now click Build Website under the Build tab.

Alternatively, you can write the following in the console:
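That is, the site generator's standard build call:

    rmarkdown::render_site()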

If you wanted to just build a single page from your site (say, if you have made a rather big site with lots of analysis) use:
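Passing a single .Rmd to render_site() rebuilds just that page; the filename here is hypothetical:

    rmarkdown::render_site("methodology.Rmd")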

  1. Stage, commit and then push the files to your GitHub. I had some issues staging the site_libs folder in the Git tab. I fixed it by closing and reloading my R project, then clicking the cog symbol (under the Git tab) > Shell and typing git add . If you get an error message about the index file being locked, go and delete it and try again; if you can't delete it, restart the machine and try again (you will find it in the .git folder within your project). Once git add . runs you should see all the files staged, and you can then commit and push the changes to GitHub.

  2. So your ‘built’ website is up on GitHub, but you need to tell GitHub where to build the site from… Go to your GitHub repo > Settings, scroll down to GitHub Pages and select the Source as the master branch.

  3. Click on the link provided where your site is published and you should have a website with two tabs. Here is what mine looks like:

For more information on hosting your code from RStudio on GitHub check out the book Happy Git and GitHub for the useR

My RMarkdown site can be found at this GitHub Pages link, but note that I've added a Shiny tab… which is covered in an optional extra.

At this stage, it's important to highlight the need for sensible representations of data and analysis. This is especially true with the advent of GitHub and journals publishing data and code, meaning it is almost impossible to present false representations of analysis without being found out.

8.4.5 Warning

Whilst it might seem tempting to use the most advanced interactive maps for your assignments within this module and course, it is important to think and reflect upon the appropriateness of the mapped output. You should ask yourself the following questions:

  • What am I trying to show with this map?
  • Will interactive elements aid this in any way or just confuse users?
  • Is there a more concise way to display this data?
  • Do I need all this information? Is it all relevant to the message I am trying to portray with the map?

In all these examples I've used the same data (hotels and Airbnbs in London boroughs); however, as I'm only showing two datasets, could this be represented without all this complexity? In this case the answer really depends on the audience you are trying to get something across to. For example, for use in an academic journal, as there are only two datasets being mapped, a static map like we produced in the Map making practical would be more appropriate. However, an interactive map (similar to what we have produced) might be more useful for incorporation into a website…

The take home message from this is to critically think about the best way to map and disseminate your data/results.

8.4.6 Advanced online publishing

Already familiar with RPubs, the RMarkdown site generator or even Shiny? Try to produce an online document using either bookdown, flexdashboard, blogdown or shinydashboard. These are listed in order of difficulty, from my experience.

Hint: this document is made using bookdown, so check out my GitHub for how I set it up.

If you want to extend bookdown or automate the process of building, have a look at continuous integration using GitHub Actions.


16.3 Recommended Data Masking Workflow

Figure 16-1 shows that the production database is cloned to a staging region and then masked there. During the masking process, the staging and test areas are tightly controlled like a production site.

Figure 16-1 Data Masking Workflow

Data masking is an iterative and evolving process handled by the security administrator and implemented by the database administrator. When you first configure data masking, try out the masking definition on a test system, then add a greater number of columns to the masking definition and test it to make sure it functions correctly and does not break any application constraints. During this process, you should exercise care when removing all embedded references to the real data while maintaining referential integrity.

After data masking is configured to your satisfaction, you can use the existing definition to repeatedly mask after cloning. The masking definition, however, would need to evolve as new schema changes require new data and columns to be masked.

After the masking process is complete, you can distribute the database for wide availability. If you need to ship the database to another third-party site, you are required to use the Data Pump Export utility, and then ship the dump file to the remote site. However, if you are retaining the masked data in-house, see "Data Masking Task Sequence".


Deleting columns from a table's schema definition

This product or feature is covered by the Pre-GA Offerings Terms of the Google Cloud Platform Terms of Service. Pre-GA products and features may have limited support, and changes to pre-GA products and features may not be compatible with other pre-GA versions. For more information, see the launch stage descriptions.

You can delete columns from an existing table's schema definition manually.

Manually delete a column

You can delete a column from an existing table by using the ALTER TABLE DROP COLUMN data definition language (DDL) statement.
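A minimal sketch of the statement, with hypothetical dataset, table and column names:

    ALTER TABLE mydataset.mytable
    DROP COLUMN IF EXISTS column_to_delete;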


Create a notes field for messages

To create a field so you can add notes to the messages, you need to create a Text field and enable in-cell editing. Use the same steps as above, choosing the Text field type instead.

The following video tutorial shows how to create a custom field in Outlook 2010 and newer.


Create sync group

Go to the Azure portal to find your database in SQL Database. Search for and select SQL databases.

Select the database you want to use as the hub database for Data Sync.

The hub database is a sync topology's central endpoint, in which a sync group has multiple database endpoints. All other member databases with endpoints in the sync group sync with the hub database.

On the SQL database menu for the selected database, select Sync to other databases.

On the Sync to other databases page, select New Sync Group. The New sync group page opens with Create sync group.

On the Create Data Sync Group page, change the following settings:

If you choose New database, select Create new database. Then on the SQL Database page, name and configure the new database and select OK.

Hub win means that when conflicts occur, data in the hub database overwrites conflicting data in the member database.

Microsoft recommends creating a new, empty database for use as the Sync Metadata Database. Data Sync creates tables in this database and runs a frequent workload. This database is shared as the Sync Metadata Database for all sync groups in a selected region and subscription. You can't change the database or its name without removing all sync groups and sync agents in the region. Additionally, an Elastic jobs database cannot be used as the SQL Data Sync Metadata database, and vice versa.

Select OK and wait for the sync group to be created and deployed.

On the New Sync Group page, if you selected Use private link, you will need to approve the private endpoint connection. The link in the info message will take you to the private endpoint connections experience where you can approve the connection.

The private links for the sync group and the sync members need to be created, approved, and disabled separately.



There are two jobs, User Profile to SharePoint Quick Sync and User Profile to SharePoint Full Sync, that synchronize the User Profile database information with the UIL (User Information List). Sometimes this stops working properly, and in that case you need to run:
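As a heavily hedged sketch, the listing command presumably resembles the following, using the DaysSinceLastProfileSync property exposed by SPContentDatabase in SharePoint 2013 and later; the deletion command is not reproduced here because its exact form is not given:

    # List content databases whose UPA -> UIL sync last ran 0 or more days ago
    Get-SPContentDatabase |
        Where-Object { $_.DaysSinceLastProfileSync -ge 0 } |
        Select-Object Name, DaysSinceLastProfileSync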

The first command will list the Content Databases that haven't had the UPA -> UIL sync occur in 0 or more days. The second command will delete the sync records corresponding to those databases (it doesn't delete the databases or end-user data).