Channel: Alteryx – TAR Solutions

Starting out with Alteryx


With all of the buzz around Alteryx at the moment, I’ve decided to learn to use it. Unfortunately the project I’m currently working on in my day job doesn’t include any Tableau or Alteryx, so I will be doing this in my spare time (which is currently very limited…) and hopefully will find some nice ‘real world’ challenges on which to test it out.

Initial thoughts on Alteryx are that it should make combining multiple disparate data sources far quicker and easier than building out a SQL database as I currently would do. It’ll be interesting to see if this actually works in practice as in the real world some of the data sets can contain some nasty data that needs to be cleaned up. It also appears at first glance that some knowledge of how databases work, how to join data, etc, is required to make good use of the Join tools within Alteryx.

Also predictive analytics looks far simpler. I currently have a very basic knowledge of R and am hoping Alteryx will mean I don’t need to learn more R as I’ll be able to use Alteryx instead.

To begin I downloaded Alteryx Project Edition.

Once installed, opening Alteryx brings up a pop-up screen containing some basic tutorials (click on Open Tutorials), which I have run through as my basic training.

Alteryx Splash Screen

Next I plan to track down some real world cases where I could have used Alteryx and give it a better run out.

The post Starting out with Alteryx appeared first on Business Analytics.


Alteryx and Tableau to display UK schools data


As part of my Alteryx training, following on from my starting out with Alteryx, I decided to try and use it for a real world example to test it out. I recently had a baby so have to start thinking about schools. Luckily the UK government make school performance data public and the excellent Guardian Datablog have tidied it up for me (the data is for 2012, hence now out of date, but good enough for my training in Alteryx). The data contains school address details, local authority details, school size, school religion and multiple measures of performance.

I would like to know which schools are performing well – i.e. their pupils have high attainment in their exams – and then see which streets are within X miles, to help guide me on which streets I would need to live on to get my son into a chosen school. Alteryx has a built-in tool called Trade Area which can map a chosen radius around a point, which is ideal for approximating each school’s catchment area.

Radius of 0.3 miles around each school


Each purple dot is a school, with the circle around the dot being the approximate catchment area, although looking at this it appears the catchment area in London will, in reality, be less than 0.3 miles.

I began by downloading a couple of those cleaned up files from the Guardian data site, one containing the school performance data, another containing the relevant local authority details. I have a database background so could easily import all of the data into a database and join it up. Alternatively I could do it in Excel using some sort of lookup (vlookup, index & match, etc) although the volume of data may not make that practical. But the purpose of the exercise is to test out Alteryx so I didn’t want to prepare the data in advance, I wanted to try it all in Alteryx.

I began by importing the school data into Alteryx. This was from a csv so I used the Input tool, available in Favourites and In/Out on the toolbar, to import it. This was simple: the Input tool automatically configured the file for me. Next I imported the local authority data to give me the local authority name; this is to be used as a filter on the Tableau dashboard as I know which areas I would prefer to live in.

To cut a long story a little shorter, I joined the data in Alteryx, had a look at what I’d done using the Browse tool, available in Favourites and In/Out on the toolbar, and then realised I needed the longitude and latitude of each school to use the Trade Area tool for getting the chosen radius from the school. In addition it would help to map the data in Tableau. Fortunately someone else has mapped every UK postcode to a longitude and latitude and kindly put that data in a csv on the internet for anyone to use. This meant I had to completely alter my Alteryx model to deal with this new data, adding extra joins and carrying out additional selects; all time consuming but good for my Alteryx training.

This is where my database background helps out as I understand how to join data. I expect Alteryx sell their product as usable by dummies but in reality I think a reasonable relational database understanding will help with modelling the data and knowledge of R will help with the predictive analytics side of Alteryx (which I hope to come to later in a future post).

The schools data came with the UK postcode formatted as XXXX XXX, while the postcode to longitude and latitude mapping file has the postcode formatted as XXXXXXX. Therefore the datasets won’t join. Removing the space from the schools data postcode is the simple fix for this problem and Alteryx has a tool for that, the Formula tool, found in Favourites and Preparation on the toolbar. The Formula tool offers many different expressions for altering the underlying data. In this case I wanted to replace the spaces with a zero-length string, which is done using the ReplaceChar expression: ReplaceChar([School postcode], " ", "").

Alteryx Use Formula Tool
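For readers who think in code, the same clean-up can be sketched outside Alteryx. This is only an illustration of what the ReplaceChar expression does, not anything Alteryx runs; the function name and the sample postcode are my own assumptions.

```python
def normalise_postcode(postcode: str) -> str:
    """Remove every space, mirroring ReplaceChar([School postcode], " ", "")."""
    return postcode.replace(" ", "")


# Hypothetical sample value, formatted like the schools data (with a space)
print(normalise_postcode("SW1A 1AA"))
```

After this step the schools postcode has the same shape as the postcode in the longitude/latitude file, so the two can be matched.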

The next step is to use the Select tool, also found in Favourites and Preparation on the toolbar, to keep only the fields from the schools data which are useful to me. That file contains a lot of data and I prefer to narrow it down early in this case.

Alteryx Select Tool

Now the schools data is in a usable state I’m able to join the schools data to the UK postcode longitude/latitude data to bring the school longitude and latitude into the school performance dataset. To do this I use the Alteryx Join tool, found in the Join section on the toolbar, joining on postcode (from the postcode data) and School postcode (the formatted postcode from the schools data).

Alteryx  Join Postcode

Bringing in the Local Authority data is the next step. Once again a Join tool is used, this time joining the LA number in the Local Authority data to the LA number in the schools data. I then use another Select tool to pull the LA Name into the main dataset, along with the selected schools and longitude/latitude data pulled in previously.
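The two joins described above can be sketched in plain Python to show the logic. This is a rough equivalent under my own assumptions, not how Alteryx implements its Join tool: the field names follow the post, but every sample value (school names, postcodes, coordinates, LA numbers) is made up for illustration.

```python
# Hypothetical rows standing in for the three source files
schools = [
    {"School name": "Example Primary", "School postcode": "SW1A 1AA", "LA number": "213"},
    {"School name": "Sample Junior", "School postcode": "ZZ99 9ZZ", "LA number": "206"},
]
postcodes = {"SW1A1AA": (51.501, -0.142)}  # postcode -> (latitude, longitude)
local_authorities = {"213": "Westminster", "206": "Islington"}  # LA number -> LA Name


def join_schools(schools, postcodes, local_authorities):
    """Join schools to coordinates on postcode, then to LA Name on LA number."""
    joined = []
    for school in schools:
        key = school["School postcode"].replace(" ", "")  # same fix as the Formula tool
        if key not in postcodes:
            continue  # behaves like an inner join: unmatched schools are dropped
        latitude, longitude = postcodes[key]
        joined.append({
            "School name": school["School name"],
            "LA Name": local_authorities.get(school["LA number"]),
            "Latitude": latitude,
            "Longitude": longitude,
        })
    return joined


for row in join_schools(schools, postcodes, local_authorities):
    print(row)
```

The second sample school has no match in the postcode lookup, so it does not reach the output. That is worth checking for in the real data too, since a school with an unrecognised postcode would silently disappear from the map.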

All that remains is to get Alteryx to create the trade area to display the radius around the school. The first step is to tell Alteryx which points to use for the trade area. In the Spatial section of the toolbar the Create Points tool is used – this is what turns the longitude and latitude into points that the Trade Area tool can work with. Ensure the school longitude and latitude fields are used in the Create Points X and Y Field area (longitude as X, latitude as Y).

Alteryx Create Points

This can then be connected to the Trade Area tool. I want to create a radius around each school of 0.3 miles. To do this, set a Specific Value of 0.3 with Miles as the units of the radius.

Alteryx Trade Area
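To make the idea of a 0.3 mile radius concrete, here is a small sketch that tests whether an address falls inside the circle around a school. It uses the haversine great-circle formula, which is my own choice for illustration; I don’t know how the Trade Area tool calculates its circles internally, and the coordinates below are invented.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(school, address, radius_miles=0.3):
    """True if the address lies inside the circle drawn around the school."""
    return distance_miles(school[0], school[1], address[0], address[1]) <= radius_miles


school = (51.5000, -0.1000)      # hypothetical (latitude, longitude)
near_street = (51.5020, -0.1000)
far_street = (51.5100, -0.1000)
print(within_radius(school, near_street), within_radius(school, far_street))
```

A straight-line radius is only an approximation of a catchment area, as noted above, but it is enough for a first pass at shortlisting streets.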

I now recommend using the Browse tool to check the work in the Alteryx map view. Set the map to Maps Powered By CloudMade to see the data as in the image at the top of this post.

To finish off, use the Output tool to create a Tableau Data Extract (*.tde) in a location of your choice.

Alteryx Output Tableau Data Extract

The final project in Alteryx is as follows, with the Tableau Data Extract created.

Alteryx Schools 2012 Project

Thank Alteryx for making that more straightforward than any other method, then open up Tableau and use the newly created Tableau Data Extract to visualise the data. The Tableau dashboard I quickly put together using the Alteryx output Tableau Data Extract is here.

The post Alteryx and Tableau to display UK schools data appeared first on Business Analytics.

2012 Primary Schools Performance in Tableau

Starting out with R in Alteryx


One of the great things that makes Alteryx so useful is the integration with R, making advanced statistical analysis possible for those without the ability to code in R. As part of my Alteryx training series the R integration is one of the areas I believe is key to learn.

If you are new to Alteryx and would like to try out using the R integration the first thing to do is install the R predictive tools.

Go to Help – About to work out which version of Alteryx you have installed and subsequently which of the R installers to install.

When starting out using Alteryx and the R functionality I suggest using the built in Alteryx R demo projects.

In File – Open Sample – Predictive Analytics there are a lot of projects already built using many of the R predictive analytics components. Work through those, get a basic understanding and then you can call yourself a data scientist :-)

Alteryx Predictive Analytics Samples

The post Starting out with R in Alteryx appeared first on Business Analytics.

Alteryx output to Tableau Data Extract tde file


Outputting data from Alteryx to Tableau is incredibly simple: Alteryx offers TDE as an output File Format.

Alteryx Output Data Tool

Just select your data, drag in an Output tool and set the file type to be Tableau Data Extract (tde). Alternatively enter a .tde file name into the ‘Write to File or Database’ input box and it’ll default to a Tableau Data Extract File Format.

If you don’t have a tde file already created, Alteryx will create it for you – although this isn’t 100% clear. Type the name and location you want for your Tableau Data Extract in the ‘Write to File or Database’ location and Alteryx will create and put your tde file in that location.

Alteryx Tde Output Configuration

In the Output Options you can now append to an existing Tableau Data Extract file, meaning you don’t need to recreate the entire tde every time, particularly useful if you have incremental data. If not then go for the Overwrite Existing Extract File (Create if does not Exist) option.

The post Alteryx output to Tableau Data Extract tde file appeared first on Business Analytics.

Publish Tableau Data Extract directly from Alteryx


For those Alteryx and Tableau Server users the ability to publish Tableau Data Extract (tde) files directly from Alteryx to the Tableau Server is a huge benefit for report automation. There are a number of good posts already published advising how this should be done, such as this excellent guide from Interworks. In this post I’ll replicate some of what is said in the post, should that link break at some point, and also add to it where I ran into difficulties.

To publish to Tableau Server from Alteryx requires using Tableau’s tabcmd. I expect it’s also possible using the REST API from Tableau but I was unable to work it out, so this post will focus on the tabcmd solution, which, in my opinion, is far simpler. If you’re not a Tableau Server administrator in your role you may need to install tabcmd.

Do the following to set up a workflow to publish directly to Tableau Server from Alteryx, using the workflow configuration Events:

Step 1

Open the Alteryx workflow that creates the tde file.

AlteryxOutputToTde

Step 2

In the Events section of the Workflow – Configuration add a Run Command. Also ensure ‘Enable Events’ is checked.

AlteryxWorkflowEvents

Step 3

Choose “After Run Without Errors”

AlteryxEventRunCommand

Step 4

Enter the location of the tabcmd.exe in the Command box

AlteryxEventTabcmd

Step 5

Enter the command line to execute in the Command Arguments section. I would recommend entering this into Notepad first and copying into the Command Arguments section.

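For some unknown reason, when I typed the command directly into Alteryx, or typed it into MS Word and copied it into Alteryx, the quotation marks in the command string weren’t recognised once Alteryx passed the command to tabcmd, causing it to fail. Entering the command into Notepad and copying it into Alteryx got me around that problem. (For the Word route at least, a likely culprit is straight quotes being converted to curly ‘smart’ quotes, which the command line does not treat as quotation marks.)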

AlteryxEventCommands

Example tabcmd command line:

publish "\\NetworkLocation\TableauDataToPublish.tde" -s TableauServer -u UserName -p Password -t TableauSite -o -r "Data Sources"

Important things to note about the string:

  • publish = the tabcmd being called
  • -s = tells tabcmd which Tableau Server to use. This doesn’t need to be the “https://TableauSite” URL, it is better as the server name as entered to the Tableau Postgres database
  • -u = the username used to login to the Tableau Server (also needs to be set up on the Tableau Server)
  • -p = the password used to login to the Tableau Server
  • -t = the name of the Tableau Site on the Tableau Server to use
  • -r = the project name to publish to. If this is not included the Default project is used.
  • -o = Overwrite the existing tde file with the same name

There are a number of things that can be specified in the tabcmd command string. The Tableau help documentation has a comprehensive list of those items.
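As a sanity check on the argument string, the pieces listed above can be assembled programmatically, which avoids hand-typed quotation marks altogether. This is a sketch only: the function name is my own, the values are the placeholders from the example command, and it builds the string without calling tabcmd.

```python
import subprocess


def build_publish_args(tde_path, server, username, password, site,
                       project=None, overwrite=True):
    """Assemble the arguments for 'tabcmd publish' using the flags described above."""
    args = ["publish", tde_path, "-s", server, "-u", username, "-p", password, "-t", site]
    if overwrite:
        args.append("-o")          # overwrite an existing tde of the same name
    if project:
        args += ["-r", project]    # omit to publish to the Default project
    return args


args = build_publish_args(
    r"\\NetworkLocation\TableauDataToPublish.tde",
    "TableauServer", "UserName", "Password", "TableauSite",
    project="Data Sources",
)
# list2cmdline adds plain double quotes around any argument containing a space
print(subprocess.list2cmdline(args))
```

The printed string can be pasted into the Command Arguments box. Because the quoting is generated rather than typed, any quotation marks it contains are plain straight quotes.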

Step 6

Ensure the Timeout (in seconds) is set appropriately to give enough time for the tde file to be published to the Tableau Server

 

Now it’s complete, run the workflow and you should no longer need to manually publish tde files.

Important notes:

  • If using the Alteryx server the Alteryx service account will need setting up on Tableau Server with a Publisher role.
  • If tabcmd fails use the tabcmd.log file to see the error message(s). Alteryx will only tell you it failed; the tabcmd.log will tell you why it failed. This is copied from the tabcmd overview on the Tableau website, advising where to find the tabcmd.log:

    Status messages and logs

    When a command is successful, tabcmd returns a status code of zero. A full error message for non-zero status codes is printed to stderr. In addition, informative or progress messages may be printed to stdout.

    A full log named tabcmd.log that includes debugging, progress, and error messages is written to C:\Users\<username>\AppData\Local\Tableau.

 

The post Publish Tableau Data Extract directly from Alteryx appeared first on Business Analytics.


Publish to Tableau Server from Alteryx using Run Command


One of the things I often read on the Alteryx forums is people asking how to publish to Tableau Server from Alteryx. Previously I wrote a post about publishing to Tableau Server from Alteryx using the Events functionality. A downside of this is the timing of the publish to Tableau – it either happens before or after the workflow has run. To publish to Tableau Server from within an Alteryx module the Run Command tool can be used, meaning publishing from Alteryx to Tableau Server can happen from anywhere within the workflow.

The Alteryx Run Command tool allows the running of any command line script during a workflow. Tableau’s tabcmd enables publishing to Tableau Server direct from the command line. Therefore in Alteryx we can write a command line script calling tabcmd and execute that script using the Run Command tool, meaning we can publish directly to Tableau Server from Alteryx in a workflow.

Follow these steps for a simple example:

1. Add a Text Input tool and enter the filepath of the Tableau Data Extract file.

2. Add a Formula tool and create an Output Field called “cmd” with your script to run in the Command prompt. It’s recommended to create in Notepad initially and test it in the Command prompt to ensure it works as expected.

3. The structure of the Command prompt script should be:

a. TabCmd location

b. Command to run on TabCmd – ‘publish’ in this example

c. Tde file to publish to Tableau Server

d. Options/arguments required to publish using TabCmd. http://onlinehelp.tableau.com/current/server/en-us/tabcmd_cmd.htm

Alteryx Tabcmd Formula

4. Use a Select tool and select the Output Field just created. Only the Command script should be loaded into the Run Command.

Alteryx Tabcmd Select Cmd

5. Add a Run Command tool:

a. In the Write Source Output section identify where the batch file should be saved. In the examples I use the Alteryx temp directory. Set it to be a csv File Format, no delimiter, no field names and no quoted output fields

Alteryx Tabcmd Run Command Output

b. Call the bat file just created in the Run External Program – Command area

Alteryx Tabcmd Run Command

The final workflow to publish to Tableau Server from Alteryx is as follows:

Alteryx Publish To Tableau Workflow

This could easily become a macro by changing the aspects of the Formula and/or Text Input tool to load any tde file.

The post Publish to Tableau Server from Alteryx using Run Command appeared first on Business Analytics.


Unzip a file in Alteryx


A question often seen on the Alteryx forums is how to unzip a file in Alteryx.

In an earlier post I covered how to use the Alteryx Run Command tool to publish a Tableau Data Extract to the Tableau Server in an Alteryx workflow using TabCmd.

This post is similar but shows how to unzip using the Run Command tool. The Run Command tool is able to call exe programs, meaning anything an exe can do is possible within Alteryx.

Credit for this solution goes to jdunkerley79, who supplied it in the Alteryx forum: http://community.alteryx.com/t5/Data-Preparation-Blending/Can-Alteryx-unzip-a-file-as-part-of-the-workflow/td-p/10604

The first thing to do is install software to unzip. Currently I’m using 7-zip (http://www.7-zip.org/). This can be called via the command prompt.

Next is how to use this in Alteryx:

1. Use a text input with the full Zip File Path to the file to unzip

Alteryx Unzip File Path

2. Use a formula tool to write the unzip command. The field can be called cmd. For example:

"C:\Program Files\7-Zip\7z.exe" e "[FilePath]" (with FilePath being the field in the text input tool containing the file path to the zipped file)

Alteryx Unzip Formula Tool

3. The unzipped file will be dropped into the directory from which the command was called, hence this needs to be set before the unzip command is called. Using text inputs before and after the Select cmd field will enable us to set the directory within the command line. Remember that in the command line a directory must be referred to via its mapped drive letter, not via the UNC path \\server\directory. Therefore we use pushd and popd to temporarily map directories.

Alteryx Unzip Command

This is what each line of the script is doing:

Alteryx Unzip Command Explanation

4. Notice all of the fields going to the Union tool are called ‘cmd’.

5. We have now created the script and it can be loaded into the Run Command tool. Loading a script into the Run Command tool automatically creates and runs a .bat file, which effectively is the script. This file just requires naming and subsequently calling in the Run Command tool…which actually is far simpler than I make it sound.

a. The union field is the only thing taken into the Run Command tool – i.e. only the script, no other fields.

b. The Write Source section, prominently (and perhaps misleadingly…) labelled Output, is where the script is ‘saved as’ a .bat file.

c. In this example we save it to the Alteryx temp (%temp%) directory. Set it to be file type csv, first row doesn’t contain field names and Never Quote Output Fields

d. As we have now created the .bat file we now call it in the Run External Program Command section.

Alteryx Unzip Run CommandTool

 

e. Finally we created the file ziplist.txt in the script, containing the name of the file unzipped – we need to take this out of the Run Command tool to use in the workflow. To take a value out of the Run Command tool use the Read Results section and open the file in the Input section. This can also be a csv.

f. Note: as none of these Output and Input files exist while you are configuring the Run Command tool, you’ll be presented with errors – these should go away when running the workflow.
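To show what the Run Command tool ends up writing, here is a sketch that builds a similar script and saves it as a .bat file, one command per line with no quoting or header row. The actual script is only shown in the screenshots above, so the lines below are my assumption based on the text (pushd, the 7-Zip call, popd); the ziplist.txt step is left out because its exact command is not given, and both example paths are hypothetical. The sketch writes the file but does not run it.

```python
import os
import tempfile

SEVEN_ZIP = r"C:\Program Files\7-Zip\7z.exe"  # install location used in the post


def build_unzip_script(zip_path, extract_dir):
    """Return the script lines; each one plays the role of a 'cmd' row in the Union."""
    return [
        f'pushd "{extract_dir}"',          # make the target directory current
        f'"{SEVEN_ZIP}" e "{zip_path}"',   # 7-Zip 'e' extracts into the current directory
        "popd",                            # return to the original directory
    ]


def write_batch_file(lines, bat_path):
    """Write one command per line, as the csv output with no delimiter does."""
    with open(bat_path, "w", newline="\r\n") as handle:
        for line in lines:
            handle.write(line + "\n")


# Hypothetical paths, for illustration only
lines = build_unzip_script(r"\\server\share\data.zip", r"C:\Temp\Extract")
bat_path = os.path.join(tempfile.gettempdir(), "UnzipExample.bat")
write_batch_file(lines, bat_path)
print("\n".join(lines))
```

Printing the lines before running anything is a cheap way to confirm the quoting is right, in the same spirit as testing the command in Notepad and the Command prompt first.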

6. The next step is to create and output the file path of the unzipped file. The formula tool is used to create the file path of this file. We put the unzipped file in the Alteryx engine temp directory in a new folder called Extract. The name of the file is taken from the ziplist.txt output from the Run Command tool. Therefore our formula is:

[Engine.TempFilePath] + 'Extract\' + [Field_1]

This is what the Alteryx engine directory looks like once the workflow has run, with the actual file inside the Extract folder:

Alteryx Unzip Directory

I created this as a macro, with the full workflow here.

Alteryx Unzip Macro

The post Unzip a file in Alteryx appeared first on Business Analytics.

Move or Copy files in Alteryx


In previous posts I’ve written about how to use the Run Command tool in Alteryx to publish a Tableau extract to Tableau Server and how to unzip a file in Alteryx. Another great use for the Alteryx Run Command tool is to move and copy files.

To move files from a directory into another directory the following information is required:

  1. File path of file to move
  2. Destination directory

Set up the workflow in the following way:

1. Enter the above fields in a Text Input tool, creating the fields FullPath and DestinationDirectory

Alteryx Move Copy Text Input

Note: The backslash at the end of the DestinationDirectory is very important. Without that a new file would be created called Engine in D:\Alteryx.

2. Connect to a Formula tool to create the Move command. I call the field ‘cmd’. It’s recommended this is first written and tested in Notepad to ensure the command works as expected. (To copy a file edit this command expression.)

Alteryx Move Copy FormulaTool

3. Next step is to join this to a Select tool, only selecting the new ‘cmd’ field. The Run Command tool should only have one field loaded into the tool. In this case the ‘cmd’ field is loaded into the tool.

4. Add the Run Command tool to the workflow. The Run Command tool will take the command written in the field ‘cmd’ and convert it to a bat file. The Run Command tool will then call this bat file.

a. The Write Source Output is where the .bat file is created. This can be set as a .flat file or .csv – anything that’s a flat text output. (If using csv ensure there’s no delimiter set, the First Row does not Contain Field Names and Quote Output Fields is set to Never.) In this example the file is being written to the Alteryx temp directory, %temp%. The file is called MoveFiles.bat.

b. In the Run External Program Command section we run the .bat file just created. Therefore just enter %temp%\MoveFiles.bat.

Alteryx Move Copy Run Command

When setting up the Run Command you could be presented with a few errors as the file being created doesn’t yet exist. Running the workflow should remove the errors.

The final workflow is as follows:

Alteryx Move Copy Workflow

Credit for the post also must go to s_pichaipillai on the Alteryx forum who explained how this works: http://community.alteryx.com/t5/Setup-Configuration/Configure-Run-Command-tool-to-move-a-file/td-p/11139

The same technique can be used to copy files: simply replace the MOVE command with COPY or XCOPY.
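The command built in the Formula tool is only visible in the screenshot, so here is a sketch of what such a command string might look like, including the trailing backslash safeguard mentioned earlier. The function name, the source file path and the exact command layout are my assumptions; the destination D:\Alteryx\Engine\ follows the note in step 1.

```python
def build_move_command(full_path, destination_directory, copy=False):
    """Build a MOVE (or COPY) command from the FullPath and DestinationDirectory fields."""
    # Without the trailing backslash the destination may be read as a file name
    if not destination_directory.endswith("\\"):
        destination_directory += "\\"
    verb = "COPY" if copy else "MOVE"
    return f'{verb} "{full_path}" "{destination_directory}"'


# Hypothetical source file; destination as in the note above
print(build_move_command(r"D:\Alteryx\Input\data.csv", r"D:\Alteryx\Engine"))
print(build_move_command(r"D:\Alteryx\Input\data.csv", "D:\\Alteryx\\Engine\\", copy=True))
```

Adding the backslash in the formula itself, rather than relying on it being typed correctly in the Text Input, makes the workflow a little more forgiving.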

The post Move or Copy files in Alteryx appeared first on Business Analytics.

Using Alteryx for Data Quality checks


Data Quality is now a rapidly growing area in many Financial Services organisations. There are multiple vendors, such as Informatica and Ab Initio, with software specifically marketed as a data quality tool. Undoubtedly they are both great products; however, they are expensive and, in all likelihood, in the majority of organisations it would take significant time before they are approved for purchase.

This is where Alteryx can step in: it’s a great tool for very quickly implementing a data quality solution. The price point will not deter the majority of FS organisations and the ongoing administration is not an expensive burden.

Most data quality checks are highly specific, hence are bespoke and unable to be standardised. A business rule against a specific data point in a specific data set is a unique check. For example in the financial services world there are a number of rules defining an ISIN. Depending on the country code (first 2 letters of the ISIN) the remainder of the ISIN could have a specific format and a relationship to other data points, such as a CUSIP.

For something as simple as an ISIN there are actually many specific business rules to identify whether it is correct. In the data quality world each business rule is another quality check. These quality checks all require writing to mirror the business rules.

From the technical perspective Data Quality is actually an ETL process.

  • Extraction: the source data needs to be sourced and brought into the quality check
  • Transformation: the business rule check, transforming the source data into a check result
  • Load: the capture of the results

In Alteryx terms a workflow can hold a number of checks against the same data set. A data set is Input to the workflow, the business rules are written using Formula tools and the results are Output.
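As an illustration of a single check, here is a sketch of two generic ISIN rules written as a function that turns an input value into a check result, which is the ‘transformation’ step described above. It covers only the overall format and the check digit (a Luhn checksum after converting letters to numbers); country-specific rules and relationships to other data points such as a CUSIP are not attempted. The function name and result labels are my own.

```python
import re

# Two letters (country code), nine alphanumerics, one numeric check digit
ISIN_PATTERN = re.compile(r"^[A-Z]{2}[A-Z0-9]{9}[0-9]$")


def isin_check(isin):
    """Return a check result for one ISIN value."""
    if not ISIN_PATTERN.match(isin):
        return "FAIL: format"
    # Convert letters to numbers (A=10 ... Z=35) and concatenate the digits
    digits = "".join(str(int(ch, 36)) for ch in isin)
    total = 0
    for position, ch in enumerate(reversed(digits)):
        value = int(ch)
        if position % 2 == 1:      # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return "PASS" if total % 10 == 0 else "FAIL: check digit"


for value in ["US0378331005", "US0378331006", "us0378331005"]:
    print(value, isin_check(value))
```

Returning a labelled result rather than a bare true/false keeps the output in a standard shape, so many different checks can be reported together.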

Significantly accelerate the writing of quality checks by creating an appropriate Data Quality Check template workflow in Alteryx. The inputs and checks (formula tools) are easily modified and the outputs should be standardised for streamlined reporting of the results.

Using Alteryx the technical side and automation of Data Quality checking can be achieved very quickly. Get in touch if you would like to learn more about our tactical Data Quality solution.

The post Using Alteryx for Data Quality checks appeared first on Business Analytics.
