Artificial Intelligence : Masking Sensitive Data

I’ll step away from data analytics for a moment in this series of posts on GenAI to think about Data Masking. As I commented in Artificial Intelligence - OpenAI - Data Use and Privacy, there are privacy and security concerns when sending sensitive data to an external provider (in this case a GenAI provider), and where the data structure and volume are small enough, one option is masking. I’ll write another post about applying masking in data analytics, but to step through the process incrementally it’s instructive to think about masking within a single written document, as it helps to identify what sort of masking is possible (and what isn’t) and how to apply it. ...

Artificial Intelligence : OpenAI - Analytics, Open Data and A Few Simple Prompts

As I mentioned in my Artificial Intelligence : OpenAI : Data Use and Privacy post, a key consideration when feeding data and dialogue to a third-party GenAI provider like OpenAI’s ChatGPT relates to data privacy. While investigating what you can do with the toolset, or simply whether the data you need is available with an open license, an excellent way to start is to use Open Data. It’s worth noting that just because data is found in a public location does not mean that it is Open Data. Before using the data in this way, make sure to check the publisher’s license. A good source of open data is public government data - for example as published on sites like: ...

Artificial Intelligence - OpenAI - Data Use and Privacy

This begins a series of articles / code-snippets / thoughts regarding the use of Generative Artificial Intelligence (GenAI) for data analytics - starting out primarily using OpenAI. This is not a broader discussion of uses of GenAI (though they’re kinda fun as well, and I’ll probably write about chat and API-style usage in the future). In particular I’m looking at OpenAI’s ChatGPT Plus, which (as of the date of writing) is the paid subscription option of ChatGPT that allows access to additional functionality: ...