Hot questions for Using Azure in python

Question:

I am working on a Java web application and I'm trying to use a simple AI/NLP algorithm to parse texts. I want to run a Python script, NLP.py, from my app; it uses data from another file (3 GB in size) that resides on my local PC. I downloaded the Python plugin and I run the script like this:

String pythonScriptPath = "MY-PATH\\NLP\\NLP.py";
String[] cmd = new String[3];
cmd[0] = "python"; // check version of installed python: python -V
cmd[1] = pythonScriptPath;
cmd[2] = "playing sport";
// create runtime to execute external command
Runtime rt = Runtime.getRuntime();
Process pr = rt.exec(cmd);

File hierarchy: (screenshot omitted)

Now I want to run all of this on Azure. I didn't find any relevant tutorial; I deployed the app as a regular web app, but I still don't know:

  1. Where should I upload the file that the script uses?
  2. What path should I write instead of MY-PATH?
  3. How will the Python script run on Azure? What resource should I use, and how?
  4. Will it work like this (as a web app that uses a Python plugin), or should I do something entirely different?

Answer:

1. Where should I upload the file that the script uses?

I suggest creating a new folder in your Azure web app's project directory, such as D:\home\site\wwwroot\ProcessFile.

However, an Azure Web App's file system storage is limited by your App Service plan (you can check the quota in the portal), so if your files are too large you will need to store them in Azure Storage.
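For example, here is a minimal sketch of uploading a large data file to Blob Storage, assuming the azure-storage-blob (v12) package; the connection string, container, and file names below are placeholders, not values from your project.

from azure.storage.blob import BlobServiceClient

# All values below are placeholders; use your own storage account details.
service = BlobServiceClient.from_connection_string("<your connection string>")
container = service.get_container_client("nlp-data")

# upload_blob streams the file in blocks, so even a 3 GB file
# can be uploaded in a single call.
with open("<your 3 GB data file>", "rb") as data:
    container.upload_blob(name="nlp-data.bin", data=data, overwrite=True)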

2. What path should I write instead of MY-PATH?

Just use the absolute path from above: D:\home\site\wwwroot\ProcessFile\NLP.py.

3. How will the Python script run on Azure? What resource should I use, and how?

To my knowledge, Azure Web Apps come with their own Python environment, but you don't have permission to change it. Since your NLP script relies on extra dependency packages, I suggest installing the Python site extension.

For the detailed steps, please follow these cases I answered before:

1. Install ODBC driver to Azure App Service

2. pyodbc on Azure

After installing your packages, you need to change the path parameters in your code (remember that backslashes must be escaped in Java string literals):

String python = "D:\\home\\python362x86\\python.exe";
String pythonScriptPath = "D:\\home\\site\\wwwroot\\ProcessFile\\NLP.py";
String[] cmd = new String[3];
cmd[0] = python; // use the site extension's python.exe, not the default "python"
cmd[1] = pythonScriptPath;
cmd[2] = "playing sport";
// create runtime to execute external command
Runtime rt = Runtime.getRuntime();
Process pr = rt.exec(cmd);
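Note that Runtime.exec does not wait for the script to finish; you will usually also want to call pr.waitFor() and read pr.getInputStream() and pr.getErrorStream(), otherwise the script's output and any Python errors are lost.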

Hope it helps. If you have any concerns, please feel free to let me know.

Question:

I'm trying to create a function on Azure Function Apps that receives a PDF and uses the Python tika library to parse it. This setup works fine locally, and I have the Python function set up in Azure as well; however, I cannot figure out how to include Java in the environment.

At the moment, when I try to run the code on the server, I get the error message:

Unable to run java; is it installed? Failed to receive startup confirmation from startServer.


Answer:

So this isn't possible at this time. To work around it, I abstracted the tika code out into a Java Function App and used that instead.
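For anyone in the same situation, the hand-off can be a plain HTTP call from the Python function to the Java Function App. This is only a sketch; the URL, route, and key below are hypothetical placeholders.

import requests

# Hypothetical endpoint of the Java Function App that wraps tika.
JAVA_FUNC_URL = "https://<your-java-func-app>.azurewebsites.net/api/ParsePdf"
FUNC_KEY = "<your function key>"

def parse_pdf(pdf_bytes):
    # POST the raw PDF to the Java function and return the extracted text.
    resp = requests.post(
        JAVA_FUNC_URL,
        params={"code": FUNC_KEY},  # function-level auth key
        data=pdf_bytes,
        headers={"Content-Type": "application/pdf"},
    )
    resp.raise_for_status()
    return resp.text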

Question:

I have an Azure Storage account with Data Lake Gen2. I would like to upload data from on-premise to the Lake Gen2 file systems using Python (or Java).

I have found examples of how to interact with File Shares in the Storage account, but I have not yet found out how to upload to the Lake (rather than the File Share). I have also found out how to do it for Gen1 Lakes here, but nothing except closed requests for Gen2.

My question is whether this is even possible with Python as of today; alternatively, how can I upload files to the Gen2 Lake using Java? A code snippet demonstrating the API calls for the upload would be highly appreciated.


Answer:

According to the official tutorial Quickstart: Upload, download, and list blobs with Python (see the note quoted below), you cannot directly use the Azure Storage SDK for Python to perform any operations on Azure Data Lake Storage Gen2 if you have not enrolled in the public preview of multi-protocol access on Data Lake Storage.

Note

The features described in this article are available to accounts that have a hierarchical namespace only if you enroll in the public preview of multi-protocol access on Data Lake Storage. To review limitations, see the known issues article.

So the only way to upload data to ADLS Gen2 is through its REST APIs; please refer to the Azure Data Lake Store REST API reference.

Here is my sample code to upload data to ADLS Gen2 in Python, and it works fine.

import requests
import json

def auth(tenant_id, client_id, client_secret):
    print('auth')
    auth_headers = {
        "Content-Type": "application/x-www-form-urlencoded"
    }
    auth_body = {
        "client_id": client_id,
        "client_secret": client_secret,
        "scope" : "https://storage.azure.com/.default",
        "grant_type" : "client_credentials"
    }
    resp = requests.post(f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token", headers=auth_headers, data=auth_body)
    return (resp.status_code, json.loads(resp.text))

def mkfs(account_name, fs_name, access_token):
    print('mkfs')
    fs_headers = {
        "Authorization": f"Bearer {access_token}"
    }
    resp = requests.put(f"https://{account_name}.dfs.core.windows.net/{fs_name}?resource=filesystem", headers=fs_headers)
    return (resp.status_code, resp.text)

def mkdir(account_name, fs_name, dir_name, access_token):
    print('mkdir')
    dir_headers = {
        "Authorization": f"Bearer {access_token}"
    }
    resp = requests.put(f"https://{account_name}.dfs.core.windows.net/{fs_name}/{dir_name}?resource=directory", headers=dir_headers)
    return (resp.status_code, resp.text)

def touch_file(account_name, fs_name, dir_name, file_name, access_token):
    print('touch_file')
    touch_file_headers = {
        "Authorization": f"Bearer {access_token}"
    }
    resp = requests.put(f"https://{account_name}.dfs.core.windows.net/{fs_name}/{dir_name}/{file_name}?resource=file", headers=touch_file_headers)
    return (resp.status_code, resp.text)

def append_file(account_name, fs_name, path, content, position, access_token):
    print('append_file')
    append_file_headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "text/plain",
        "Content-Length": f"{len(content)}"
    }
    resp = requests.patch(f"https://{account_name}.dfs.core.windows.net/{fs_name}/{path}?action=append&position={position}", headers=append_file_headers, data=content)
    return (resp.status_code, resp.text)

def flush_file(account_name, fs_name, path, position, access_token):
    print('flush_file')
    flush_file_headers = {
        "Authorization": f"Bearer {access_token}"
    }
    resp = requests.patch(f"https://{account_name}.dfs.core.windows.net/{fs_name}/{path}?action=flush&position={position}", headers=flush_file_headers)
    return (resp.status_code, resp.text)

def mkfile(account_name, fs_name, dir_name, file_name, local_file_name, access_token):
    print('mkfile')
    status_code, result = touch_file(account_name, fs_name, dir_name, file_name, access_token)
    if status_code == 201:
        with open(local_file_name, 'rb') as local_file:
            path = f"{dir_name}/{file_name}"
            content = local_file.read()
            position = 0
            append_file(account_name, fs_name, path, content, position, access_token)
            position = len(content)
            flush_file(account_name, fs_name, path, position, access_token)
    else:
        print(result)


if __name__ == '__main__':
    tenant_id = '<your tenant id>'
    client_id = '<your client id>'
    client_secret = '<your client secret>'

    account_name = '<your adls account name>'
    fs_name = '<your filesystem name>'
    dir_name = '<your directory name>'
    file_name = '<your file name>'
    local_file_name = '<your local file name>'

    # Acquire an Access token
    auth_status_code, auth_result = auth(tenant_id, client_id, client_secret)
    access_token = auth_result['access_token'] if auth_status_code == 200 else ''
    print(access_token)

    # Create a filesystem
    mkfs_status_code, mkfs_result = mkfs(account_name, fs_name, access_token)
    print(mkfs_status_code, mkfs_result)

    # Create a directory
    mkdir_status_code, mkdir_result = mkdir(account_name, fs_name, dir_name, access_token)
    print(mkdir_status_code, mkdir_result)

    # Create a file from local file
    mkfile(account_name, fs_name, dir_name, file_name, local_file_name, access_token)
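Note how the position parameter works in the last two calls of mkfile: action=append writes the content at the given offset within the file, and action=flush then commits everything up to that offset, which is why the append uses position 0 and the flush uses len(content).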

Hope it helps.

Question:

Is it possible to create an Azure VM AlertRule using either Java or Python? I do not see any documentation in the SDK for either language.

An acceptable answer shows a code snippet in either language and does not suggest using the Portal or PowerShell.


Answer:

@bearrito, as far as I know, there is currently no SDK for Python or Java to create an Azure VM alert rule, but we can use the Azure REST API to create the rule. Please refer to this document: https://msdn.microsoft.com/en-us/library/azure/dn510366.aspx
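For illustration, here is a minimal sketch in Python that creates a classic metric alert rule through the newer Azure Resource Manager endpoint (Microsoft.Insights/alertRules) rather than the Service Management API linked above; it authenticates with an AAD bearer token, and every ID, name, and region below is a placeholder.

import requests

# All values below are placeholders.
subscription_id = "<your subscription id>"
resource_group = "<your resource group>"
rule_name = "high-cpu-alert"
vm_id = (f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
         "/providers/Microsoft.Compute/virtualMachines/<your vm name>")
access_token = "<AAD bearer token for https://management.azure.com/>"

url = (f"https://management.azure.com/subscriptions/{subscription_id}"
       f"/resourceGroups/{resource_group}"
       f"/providers/Microsoft.Insights/alertRules/{rule_name}"
       "?api-version=2016-03-01")

# Fire when the VM's average CPU stays above 90% over a 5-minute window.
body = {
    "location": "<your vm region>",
    "properties": {
        "name": rule_name,
        "isEnabled": True,
        "condition": {
            "odata.type": "Microsoft.Azure.Management.Insights.Models.ThresholdRuleCondition",
            "dataSource": {
                "odata.type": "Microsoft.Azure.Management.Insights.Models.RuleMetricDataSource",
                "resourceUri": vm_id,
                "metricName": "Percentage CPU"
            },
            "operator": "GreaterThan",
            "threshold": 90,
            "windowSize": "PT5M",
            "timeAggregation": "Average"
        },
        "actions": []
    }
}

resp = requests.put(url, json=body,
                    headers={"Authorization": f"Bearer {access_token}"})
print(resp.status_code, resp.text)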

Meanwhile, if you use the Service Management API above, I recommend using a certificate to authenticate your application. Please refer to this page to create the certificate and upload it to the Azure portal: https://msdn.microsoft.com/en-us/library/azure/ee460782.aspx#bk_cert. If you have any updates or questions, please let me know.