Skip to content

Commit

Permalink
Use data_utils
Browse files Browse the repository at this point in the history
  • Loading branch information
pamelafox committed Jun 8, 2023
1 parent c84552c commit 5bc6895
Show file tree
Hide file tree
Showing 26 changed files with 1,925 additions and 1,042 deletions.
8 changes: 8 additions & 0 deletions .devcontainer/Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
ARG VARIANT=bullseye
FROM mcr.microsoft.com/vscode/devcontainers/base:0-${VARIANT}
RUN export DEBIAN_FRONTEND=noninteractive \
&& apt-get update && apt-get install -y xdg-utils \
&& apt-get clean -y && rm -rf /var/lib/apt/lists/*
RUN mkdir -p /opt/microsoft/azd \
&& curl -L https://github.com/Azure/azure-dev/releases/download/azure-dev-cli_0.9.0-beta.2/azd-linux-arm64-beta.tar.gz | tar zxvf - -C /opt/microsoft/azd \
&& ln -s /opt/microsoft/azd/azd-linux-arm64 /usr/local/bin/azd
33 changes: 33 additions & 0 deletions .devcontainer/devcontainer.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
{
"name": "Azure Developer CLI",
"build": {
"dockerfile": "Dockerfile",
"args": {
"VARIANT": "bullseye"
}
},
"features": {
"ghcr.io/devcontainers/features/node:1": {
"version": "16",
"nodeGypDependencies": false
},
"ghcr.io/devcontainers/features/azure-cli:1.0.8": {}
},
"customizations": {
"vscode": {
"extensions": [
"ms-azuretools.azure-dev",
"ms-azuretools.vscode-bicep",
"ms-python.python"
]
}
},
"forwardPorts": [
5000
],
"postCreateCommand": "",
"remoteUser": "vscode",
"hostRequirements": {
"memory": "8gb"
}
}
4 changes: 3 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -1,4 +1,6 @@
.venv
frontend/node_modules
.env
static
static
.azure/
__pycache__/
70 changes: 70 additions & 0 deletions README_azd.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,70 @@
# (Preview) Sample Chat App with AOAI

## Deploying with the Azure Developer CLI

> **IMPORTANT:** In order to deploy and run this example, you'll need an **Azure subscription with access enabled for the Azure OpenAI service**. You can request access [here](https://aka.ms/oaiapply). You can also visit [here](https://azure.microsoft.com/free/cognitive-search/) to get some free Azure credits to get you started.
> **AZURE RESOURCE COSTS** by default this sample will create Azure App Service and Azure Cognitive Search resources that have a monthly cost, as well as Form Recognizer resource that has cost per document page. You can switch them to free versions of each of them if you want to avoid this cost by changing the parameters file under the infra folder (though there are some limits to consider; for example, you can have up to 1 free Cognitive Search resource per subscription, and the free Form Recognizer resource only analyzes the first 2 pages of each document.)
### Prerequisites

If you open this project in GitHub Codespaces or a local Dev Container, these will be available in the environment.
Otherwise, you need to install them locally.

- [Azure Developer CLI](https://aka.ms/azure-dev/install)
- [Python 3+](https://www.python.org/downloads/)
- **Important**: Python and the pip package manager must be in the path in Windows for the setup scripts to work.
- [Node.js](https://nodejs.org/en/download/)
- [Git](https://git-scm.com/downloads)
- [Powershell 7+ (pwsh)](https://github.com/powershell/powershell) - For Windows users only.
- **Important**: Ensure you can run `pwsh.exe` from a PowerShell command. If this fails, you likely need to upgrade PowerShell.

>NOTE: Your Azure Account must have `Microsoft.Authorization/roleAssignments/write` permissions, such as [User Access Administrator](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#user-access-administrator) or [Owner](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#owner).
### Starting from scratch:

If you don't have any pre-existing Azure services (i.e. OpenAI or Cognitive Search service), then you can provision
all resources from scratch by following these steps:

1. Run `azd up` - This will provision Azure resources and deploy this sample to those resources, including building the search index based on the files found in the `./data` folder.
1. After the application has been successfully deployed you will see a URL printed to the console. Click that URL to interact with the application in your browser.
> NOTE: It may take a minute for the application to be fully deployed. If you see a "Python Developer" welcome screen, then wait a minute and refresh the page.
### Use existing resources:

If you have existing Azure resources that you want to reuse, then you must first set `azd` environment variables _before_ running `azd up`.

Run the following commands based on what you want to customize:

* `azd env set AZURE_OPENAI_RESOURCE {Name of existing OpenAI service}`
* `azd env set AZURE_OPENAI_RESOURCE_GROUP {Name of existing resource group that OpenAI service is provisioned to}`
* `azd env set AZURE_OPENAI_SKU_NAME {Name of OpenAI SKU}`. Defaults to 'S0'.
* `azd env set AZURE_SEARCH_SERVICE {Name of existing Cognitive Search service}`
* `azd env set AZURE_SEARCH_SERVICE_RESOURCE_GROUP {Name of existing resource group that Cognitive Search service is provisioned to}`
* `azd env set AZURE_SEARCH_SKU_NAME {Name of Cognitive Search SKY}`. Defaults to 'standard'.
* `azd env set AZURE_STORAGE_ACCOUNT {Name of existing Storage account}`. Used by prepdocs.py for uploading docs.
* `azd env set AZURE_STORAGE_ACCOUNT_RESOURCE_GROUP {Name of existing resource group that Storage account is provisioned to}`.
* `azd env set AZURE_FORMRECOGNIZER_SERVICE {Name of existing Form Recognizer service}`. Used by prepdocs.py for text extraction from docs.
* `azd env set AZURE_FORMRECOGNIZER_SERVICE_RESOURCE_GROUP {Name of existing resource group that Form Recognizer service is provisioned to}`.
* `azd env set AZURE_FORMRECOGNIZER_SKU_NAME {Name of Form Recognizer SKU}`. Defaults to 'S0'.


1. Run `azd up`. This will provision any missing Azure resources and deploy this sample to those resources, including building the search index based on the files found in the `./data` folder.
1. After the application has been successfully deployed you will see a URL printed to the console. Click that URL to interact with the application in your browser.
> NOTE: It may take a minute for the application to be fully deployed. If you see a "Python Developer" welcome screen, then wait a minute and refresh the page.

### Re-deploying changes

If you make any changes to the app code (JS or Python), you can re-deploy the app code to App Service by running the `azd deploy` command.

If you change any of the Bicep files in the infra folder, then you should re-run `azd up` to both provision resources and deploy code.

### Running locally:

1. Run `azd auth login`
3. Run `./start.cmd` or `./start.sh` to start the project locally.

### Note

>Note: The PDF documents used in this demo contain information generated using a language model (Azure OpenAI Service). The information contained in these documents is only for demonstration purposes and does not reflect the opinions or beliefs of Microsoft. Microsoft makes no representations or warranties of any kind, express or implied, about the completeness, accuracy, reliability, suitability or availability with respect to the information contained in this document. All rights reserved to Microsoft.
10 changes: 7 additions & 3 deletions app.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,9 @@

app = Flask(__name__)

# setup basic logging
logging.basicConfig(level=logging.INFO)

@app.route("/", defaults={"path": "index.html"})
@app.route("/<path:path>")
def static_file(path):
Expand All @@ -26,6 +29,7 @@ def static_file(path):
AZURE_SEARCH_CONTENT_COLUMNS = os.environ.get("AZURE_SEARCH_CONTENT_COLUMNS")
AZURE_SEARCH_FILENAME_COLUMN = os.environ.get("AZURE_SEARCH_FILENAME_COLUMN")
AZURE_SEARCH_TITLE_COLUMN = os.environ.get("AZURE_SEARCH_TITLE_COLUMN")
print('title', AZURE_SEARCH_TITLE_COLUMN)
AZURE_SEARCH_URL_COLUMN = os.environ.get("AZURE_SEARCH_URL_COLUMN")

# AOAI Integration Settings
Expand All @@ -44,7 +48,7 @@ def static_file(path):
SHOULD_STREAM = True if AZURE_OPENAI_STREAM.lower() == "true" else False

def is_chat_model():
if 'gpt-4' in AZURE_OPENAI_MODEL_NAME.lower():
if 'gpt-4' in AZURE_OPENAI_MODEL_NAME.lower() or 'gpt-35' in AZURE_OPENAI_MODEL_NAME.lower():
return True
return False

Expand Down Expand Up @@ -85,7 +89,7 @@ def prepare_body_headers_with_data(request):
}
]
}

app.logger.info(body)
chatgpt_url = f"https://{AZURE_OPENAI_RESOURCE}.openai.azure.com/openai/deployments/{AZURE_OPENAI_MODEL}"
if is_chat_model():
chatgpt_url += "/chat/completions?api-version=2023-03-15-preview"
Expand Down Expand Up @@ -138,7 +142,7 @@ def stream_with_data(body, headers, endpoint):
deltaText = lineJson["choices"][0]["messages"][0]["delta"]["content"]
if deltaText != "[DONE]":
response["choices"][0]["messages"][1]["content"] += deltaText

app.logger.info(response)
yield json.dumps(response).replace("\n", "\\n") + "\n"
except Exception as e:
yield json.dumps({"error": str(e)}).replace("\n", "\\n") + "\n"
Expand Down
34 changes: 34 additions & 0 deletions azure.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
# yaml-language-server: $schema=https://raw.githubusercontent.com/Azure/azure-dev/main/schemas/v1.0/azure.yaml.json

name: sample-app-aoai-chatgpt
metadata:
template: [email protected]
services:
backend:
project: .
language: py
host: appservice
hooks:
prepackage:
windows:
shell: pwsh
run: cd ./frontend;npm install;npm run build
interactive: true
continueOnError: false
posix:
shell: sh
run: cd ./frontend;npm install;npm run build
interactive: true
continueOnError: false
hooks:
postprovision:
windows:
shell: pwsh
run: $output = azd env get-values; Add-Content -Path .env -Value $output;
interactive: true
continueOnError: false
posix:
shell: sh
run: azd env get-values > .env
interactive: true
continueOnError: false
Binary file added data/employee_handbook.pdf
Binary file not shown.
135 changes: 135 additions & 0 deletions infra/abbreviations.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,135 @@
{
"analysisServicesServers": "as",
"apiManagementService": "apim-",
"appConfigurationConfigurationStores": "appcs-",
"appManagedEnvironments": "cae-",
"appContainerApps": "ca-",
"authorizationPolicyDefinitions": "policy-",
"automationAutomationAccounts": "aa-",
"blueprintBlueprints": "bp-",
"blueprintBlueprintsArtifacts": "bpa-",
"cacheRedis": "redis-",
"cdnProfiles": "cdnp-",
"cdnProfilesEndpoints": "cdne-",
"cognitiveServicesAccounts": "cog-",
"cognitiveServicesFormRecognizer": "cog-fr-",
"cognitiveServicesTextAnalytics": "cog-ta-",
"computeAvailabilitySets": "avail-",
"computeCloudServices": "cld-",
"computeDiskEncryptionSets": "des",
"computeDisks": "disk",
"computeDisksOs": "osdisk",
"computeGalleries": "gal",
"computeSnapshots": "snap-",
"computeVirtualMachines": "vm",
"computeVirtualMachineScaleSets": "vmss-",
"containerInstanceContainerGroups": "ci",
"containerRegistryRegistries": "cr",
"containerServiceManagedClusters": "aks-",
"databricksWorkspaces": "dbw-",
"dataFactoryFactories": "adf-",
"dataLakeAnalyticsAccounts": "dla",
"dataLakeStoreAccounts": "dls",
"dataMigrationServices": "dms-",
"dBforMySQLServers": "mysql-",
"dBforPostgreSQLServers": "psql-",
"devicesIotHubs": "iot-",
"devicesProvisioningServices": "provs-",
"devicesProvisioningServicesCertificates": "pcert-",
"documentDBDatabaseAccounts": "cosmos-",
"eventGridDomains": "evgd-",
"eventGridDomainsTopics": "evgt-",
"eventGridEventSubscriptions": "evgs-",
"eventHubNamespaces": "evhns-",
"eventHubNamespacesEventHubs": "evh-",
"hdInsightClustersHadoop": "hadoop-",
"hdInsightClustersHbase": "hbase-",
"hdInsightClustersKafka": "kafka-",
"hdInsightClustersMl": "mls-",
"hdInsightClustersSpark": "spark-",
"hdInsightClustersStorm": "storm-",
"hybridComputeMachines": "arcs-",
"insightsActionGroups": "ag-",
"insightsComponents": "appi-",
"keyVaultVaults": "kv-",
"kubernetesConnectedClusters": "arck",
"kustoClusters": "dec",
"kustoClustersDatabases": "dedb",
"logicIntegrationAccounts": "ia-",
"logicWorkflows": "logic-",
"machineLearningServicesWorkspaces": "mlw-",
"managedIdentityUserAssignedIdentities": "id-",
"managementManagementGroups": "mg-",
"migrateAssessmentProjects": "migr-",
"networkApplicationGateways": "agw-",
"networkApplicationSecurityGroups": "asg-",
"networkAzureFirewalls": "afw-",
"networkBastionHosts": "bas-",
"networkConnections": "con-",
"networkDnsZones": "dnsz-",
"networkExpressRouteCircuits": "erc-",
"networkFirewallPolicies": "afwp-",
"networkFirewallPoliciesWebApplication": "waf",
"networkFirewallPoliciesRuleGroups": "wafrg",
"networkFrontDoors": "fd-",
"networkFrontdoorWebApplicationFirewallPolicies": "fdfp-",
"networkLoadBalancersExternal": "lbe-",
"networkLoadBalancersInternal": "lbi-",
"networkLoadBalancersInboundNatRules": "rule-",
"networkLocalNetworkGateways": "lgw-",
"networkNatGateways": "ng-",
"networkNetworkInterfaces": "nic-",
"networkNetworkSecurityGroups": "nsg-",
"networkNetworkSecurityGroupsSecurityRules": "nsgsr-",
"networkNetworkWatchers": "nw-",
"networkPrivateDnsZones": "pdnsz-",
"networkPrivateLinkServices": "pl-",
"networkPublicIPAddresses": "pip-",
"networkPublicIPPrefixes": "ippre-",
"networkRouteFilters": "rf-",
"networkRouteTables": "rt-",
"networkRouteTablesRoutes": "udr-",
"networkTrafficManagerProfiles": "traf-",
"networkVirtualNetworkGateways": "vgw-",
"networkVirtualNetworks": "vnet-",
"networkVirtualNetworksSubnets": "snet-",
"networkVirtualNetworksVirtualNetworkPeerings": "peer-",
"networkVirtualWans": "vwan-",
"networkVpnGateways": "vpng-",
"networkVpnGatewaysVpnConnections": "vcn-",
"networkVpnGatewaysVpnSites": "vst-",
"notificationHubsNamespaces": "ntfns-",
"notificationHubsNamespacesNotificationHubs": "ntf-",
"operationalInsightsWorkspaces": "log-",
"portalDashboards": "dash-",
"powerBIDedicatedCapacities": "pbi-",
"purviewAccounts": "pview-",
"recoveryServicesVaults": "rsv-",
"resourcesResourceGroups": "rg-",
"searchSearchServices": "srch-",
"serviceBusNamespaces": "sb-",
"serviceBusNamespacesQueues": "sbq-",
"serviceBusNamespacesTopics": "sbt-",
"serviceEndPointPolicies": "se-",
"serviceFabricClusters": "sf-",
"signalRServiceSignalR": "sigr",
"sqlManagedInstances": "sqlmi-",
"sqlServers": "sql-",
"sqlServersDataWarehouse": "sqldw-",
"sqlServersDatabases": "sqldb-",
"sqlServersDatabasesStretch": "sqlstrdb-",
"storageStorageAccounts": "st",
"storageStorageAccountsVm": "stvm",
"storSimpleManagers": "ssimp",
"streamAnalyticsCluster": "asa-",
"synapseWorkspaces": "syn",
"synapseWorkspacesAnalyticsWorkspaces": "synw",
"synapseWorkspacesSqlPoolsDedicated": "syndp",
"synapseWorkspacesSqlPoolsSpark": "synsp",
"timeSeriesInsightsEnvironments": "tsi-",
"webServerFarms": "plan-",
"webSitesAppService": "app-",
"webSitesAppServiceEnvironment": "ase-",
"webSitesFunctions": "func-",
"webStaticSites": "stapp-"
}
40 changes: 40 additions & 0 deletions infra/core/ai/cognitiveservices.bicep
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
param name string
param location string = resourceGroup().location
param tags object = {}

param customSubDomainName string = name
param deployments array = []
param kind string = 'OpenAI'
param publicNetworkAccess string = 'Enabled'
param sku object = {
name: 'S0'
}

resource account 'Microsoft.CognitiveServices/accounts@2022-10-01' = {
name: name
location: location
tags: tags
kind: kind
properties: {
customSubDomainName: customSubDomainName
publicNetworkAccess: publicNetworkAccess
}
sku: sku
}

@batchSize(1)
resource deployment 'Microsoft.CognitiveServices/accounts/deployments@2022-10-01' = [for deployment in deployments: {
parent: account
name: deployment.name
properties: {
model: deployment.model
raiPolicyName: contains(deployment, 'raiPolicyName') ? deployment.raiPolicyName : null
scaleSettings: deployment.scaleSettings
}
}]

output endpoint string = account.properties.endpoint
output id string = account.id
output name string = account.name
output skuName string = account.sku.name
output key string = account.listKeys().key1
Loading

0 comments on commit 5bc6895

Please sign in to comment.