Posts

Data catalogs solve the problem by tagging fields and data sets with consistent business terms and providing a shopping-style interface. Users find data sets by describing what they are looking for in the business terms they already use, and they understand the data in those sets through tags and descriptions written in the same terms. Data lakes are the do-it-yourself version of a data warehouse: data engineering teams pick and choose the metadata, storage, and compute technologies they want, depending on the needs of their systems.

Multi-tenant operation

Different applications run in the same cluster.
Multiple instances of the same application run in the same cluster.
Applications that handle sensitive data get dedicated instances; non-sensitive data uses shared instances.

Hybrid architectures are also possible, such as a SaaS provider using per-customer workloads for sensitive data combined with multi-tenant shared services.

Isolation methods

Control-plane isolation mechanisms: namespaces, access control, resource quotas.
Data-plane isolation mechanisms: network isolation.

Pod-to-pod communication can be controlled using Network Policies, which restrict communication between pods using namespace labels or IP address ranges. In a multi-tenant environment where strict network isolation between tenants is required, it is recommended to start with a default policy that denies communication between pods, plus another rule that allows all pods to query the DNS server for name resolution.
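The default-deny-plus-DNS recommendation can be sketched as two NetworkPolicy manifests. The snippet below builds them as Python dictionaries and prints them as JSON, which `kubectl apply -f` accepts. The namespace `tenant-a` and the policy names are hypothetical, and the DNS rule is deliberately loose: it allows egress on port 53 to any destination, which a real cluster may want to narrow to its DNS pods.

```python
import json


def default_deny_policy(namespace: str) -> dict:
    """Select every pod in the namespace and allow no ingress or egress."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},  # empty selector matches all pods
            "policyTypes": ["Ingress", "Egress"],  # no rules listed, so all traffic is denied
        },
    }


def allow_dns_policy(namespace: str) -> dict:
    """Allow every pod in the namespace to send DNS queries (port 53, UDP and TCP)."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-dns", "namespace": namespace},
        "spec": {
            "podSelector": {},
            "policyTypes": ["Egress"],
            "egress": [
                {
                    "ports": [
                        {"protocol": "UDP", "port": 53},
                        {"protocol": "TCP", "port": 53},
                    ]
                }
            ],
        },
    }


if __name__ == "__main__":
    # "tenant-a" is an assumed namespace name used only for illustration.
    for policy in (default_deny_policy("tenant-a"), allow_dns_policy("tenant-a")):
        print(json.dumps(policy, indent=2))
```

Network Policies are additive, so the DNS rule opens only port 53 while the default-deny policy keeps everything else closed.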

Configuration

`.azure/config`

[defaults]
location = westus

[cloud]
name = AzureCloud

[core]
output = table
az config set defaults.location=westus2 defaults.group=MyResourceGroup

az v2 does not support `config`; edit the configuration file directly.

Azure regions

az account list-locations --query "sort_by([].{DisplayName:displayName, Name:name}, &DisplayName)" --output table

Create a function app

#!/bin/bash

# Function app and storage account names must be unique.
storageName=mystorageaccount$RANDOM
functionAppName=myserverlessfunc$RANDOM
region=westeurope

# Create a resource group.
az group create --name myResourceGroup --location $region

# Create an Azure storage account in the resource group.
az storage account create \
  --name $storageName \
  --location $region \
  --resource-group myResourceGroup \
  --sku Standard_LRS

# Create a serverless function app in the resource group.

Install func

npm i -D azure-functions-core-tools@3
export PATH=./
export CLI_DEBUG=1
func host start --verbose

Install playwright-chromium

export PLAYWRIGHT_BROWSERS_PATH=0
npm install [email protected]

Confirm where Chrome is stored

node_modules/playwright-chromium/.local-browsers/chromium-792639

Create a function

/home/ubuntu/sls/azure-sls/node_modules/.bin/func init
func new
func start

Local testing

export CLI_DEBUG=1
func host start --verbose

host.json

{
  "version": "2.0",
  "logging": {
    "applicationInsights": {
      "samplingSettings": {
        "isEnabled": true,
        "excludedTypes": "Request"
      }
    }
  },
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[2.*, 3.0.0)"
  }
}

If you run into the following problem

Value cannot be null.

Create a function

npm install -g [email protected]
sls -v

Framework Core: 2.65.0
Plugin: 5.5.1
SDK: 4.3.0
Components: 3.17.2

sls create -t azure-nodejs -p azure-fn
cd azure-fn
npm install
npm list |grep serverless-azure-functions
└─┬ [email protected]

Deploy the function

set AZURE_SUBSCRIPTION_ID=02a23ad5-
set AZURE_TENANT_ID=e9950462
set AZURE_CLIENT_ID=39258bc8
set AZURE_CLIENT_SECRET=hYdvD0
sls deploy --dryrun

Test

sls invoke -f hello -d '{"name": "Azure"}'

Clean up

empty.json

{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": { },
  "variables": { },
  "resources": [ ],
  "outputs": { }
}

az group deployment create --mode complete --template-file .

Quasar App Flow

In order to better understand how a boot file works and what it does, you need to understand how your website/app boots:

1. Quasar is initialized (components, directives, plugins, Quasar i18n, Quasar icon sets)
2. Quasar Extras get imported (Roboto font, if used, icons, animations, …)
3. Quasar CSS & your app’s global CSS are imported
4. App.vue is loaded (not yet being used)
5. Store is imported (if using Vuex Store in src/store)
6. Router is imported (in src/router)
7. Boot files are imported
8. Router default export function executed
9. Boot files get their default export function executed
10. (if on Electron mode) Electron is imported and injected into Vue prototype
11. (if on Cordova mode) Listening for “deviceready” event and only then continuing with following steps
12. Instantiating Vue with root component and attaching to DOM

Compute time vs. number of requests

Average compute duration: 427586 / 27386 = 15.6 (seconds)

Set the default profile

aws configure list-profiles
default us-east-1 us-east-2 us-west-2 us-west-1 eu eu-west-1 af-south-1 ap-east-1 ap-south-1 ap-northeast-3 ap-northeast-2 ap-southeast-1 ap-southeast-2 ca-central-1 eu-west-2 eu-south-1 eu-west-3 eu-north-1 me-south-1 sa-east-1

export AWS_PROFILE=us

Download the deployment package

aws lambda get-function --function-name webdriver

"Code": {
  "RepositoryType": "S3",
  "Location": "https://awslambda-ap-ne-1-tasks.s3.ap-northeast-1.amazonaws.com/snapshots/webdriver-aeb2eb63-9baf-4d06-9d3a-79459b172200?versionId=a71tk2dwwmvW1lPNB5VHKq8SbGS3laqE&X-Amz-Security-Token=IQoJb3JpZ2luX2VjEHIaDmFwLW5vcnRoZWFzdC0xIkcwRQIhAMRkIxPh1Fkd2nlCzgiDbsrmnCZEVunHibw2Cm6cyRIUAiB5t60IO6iESPDeUsTuQEjGyLfI73QyMK1mJY9Al70yECqNBAj8%2F%2F%2F%2F%2F%2F%2F%2F%2F%2F8BEAIaDDkxOTk4MDkyNTEzOSIMjVD0S8e1HGmJujr6KuEDO8SCL9OcolFOwL4IKMbE3euJLEtiGjSxH6c8jRPbnjp07Zf%2BxrOfJmWT2MORQs0RAQSLJV5nOFfRWTIPI4dSNhI3Q628XqklZ8%2BF1UktvA5vRdEU3LhDvOSsDCmL19k&X-Amz-Signature=7f876918ec5283db390a3037512e7ad62e434330ec3e406db18b25f25f3da0fe"

Download the deployment package from Location

PROF = "eu-central-1"
aws lambda create-function --function-name webdriver --runtime nodejs12.x --zip-file fileb:///home/ubuntu/webdriver.zip --handler index.handler --role arn:aws:iam::762491489154:role/service-role/webdriver-role-3hxi35t5 --timeout 63 --memory-size 1024 --layers arn:aws:lambda:us-east-1:764866452798:layer:chrome-aws-lambda:25 --profile us

Configure role-policy.

Prepare the AWS client environment

pip install awscli
git clone https://github.com/wubigo/API.git
python API/python/aws/aws.py
cp API/python/aws/cred.json ~/.aws/credentials
cp API/python/aws/config ~/.aws/config

Create the function deployment package

mkdir lambda_web
wget https://github.com/wubigo/API/blob/master/nodejs/lambda/aws/index.js -P lambda_web
zip -r webdriver.zip lambda_web/*

Configuration

policy.json

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": ["lambda.amazonaws.com", "s3.amazonaws.com"]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

export ACCOUNT_ID=820934811997
aws iam create-role --role-name lambda-s3 --assume-role-policy-document file://policy.json
aws iam attach-role-policy --role-name lambda-s3 --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
aws iam attach-role-policy --role-name lambda-s3 --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess

Copy

aws lambda create-function --function-name webdriver --runtime nodejs12.

HDFS

The maximum number of files in HDFS depends on two things: the total storage space in the cluster and the heap size of the NameNode.

1) You can find out what percentage of storage has been used from the HDFS NameNode UI.
2) The basic rule of thumb is that 1 GB of heap is needed for every million files. Each file object and each block object takes about 150 bytes of memory.
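As a rough illustration of the two figures above, the sketch below computes the raw object memory (150 bytes per file object and per block object) next to the 1 GB per million files rule of thumb. The function names and the one-block-per-file default are assumptions made for this example, not anything HDFS provides.

```python
BYTES_PER_OBJECT = 150  # approximate cost of one file object or one block object


def namenode_object_bytes(num_files: int, blocks_per_file: float = 1.0) -> float:
    """Raw memory for file and block objects, ignoring all other NameNode overhead."""
    objects = num_files * (1 + blocks_per_file)  # one file object plus its block objects
    return objects * BYTES_PER_OBJECT


def rule_of_thumb_heap_gb(num_files: int) -> float:
    """Heap suggested by the 1 GB per million files rule of thumb."""
    return num_files / 1_000_000


if __name__ == "__main__":
    files = 50_000_000  # hypothetical cluster size
    raw_gb = namenode_object_bytes(files) / 1e9
    print(f"raw object memory: {raw_gb:.1f} GB")
    print(f"rule-of-thumb heap: {rule_of_thumb_heap_gb(files):.1f} GB")
```

The rule of thumb comes out noticeably higher than the raw arithmetic, which leaves headroom beyond the bare file and block objects.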

User agent

Mobile devices browsing the web often see a pared-down version of sites, lacking banner ads, Flash, and other distractions. If you try changing your User-Agent to something like the following, you might find that sites get a little easier to scrape!

User-Agent:Mozilla/5.0 (iPhone; CPU iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53

Scrapy architecture

The data flow in Scrapy is controlled by the execution engine, and goes like this:

Replication Method

Replication Method - Replication method to use for extracting data from the database. STANDARD replication requires no setup on the DB side but will not be able to represent deletions incrementally. CDC uses the Binlog to detect inserts, updates, and deletes. This needs to be configured on the source database itself.

S3 Support in Apache Hadoop

Apache Hadoop ships with a connector to S3 called “S3A”, with the URL prefix “s3a:”; its previous connectors “s3” and “s3n” are deprecated and/or deleted from recent Hadoop versions.

Enable Reader Mode

Open a new tab and enter chrome://flags/#enable-reader-mode to jump directly to the Reader Mode Flag.

Usage

Reader Mode in Chrome is mad-easy to use. When you’re on a page that you’d like to push into the reader view, click on the three-dot menu button in the upper right, and then choose “Distill page.”

Upgrade Yarn

npm install -g [email protected]

> [email protected] preinstall C:\local\node-v14.17.5-win-x64\node_modules\yarn
> :; (node ./preinstall.js > /dev/null 2>&1 || true)

C:\local\node-v14.17.5-win-x64\yarn -> C:\local\node-v14.17.5-win-x64\node_modules\yarn\bin\yarn.js
C:\local\node-v14.17.5-win-x64\yarnpkg -> C:\local\node-v14.17.5-win-x64\node_modules\yarn\bin\yarn.js
+ [email protected]
updated 1 package in 0.7s

yarn link

/home/ubuntu/.config/yarn/link/matrix-js-sdk

ubuntu@ip-172-31-44-135:~/.config/yarn/link$ ll
total 8
drwxrwxr-x 2 ubuntu ubuntu 4096 Aug 19 09:31 ./
drwxrwxr-x 3 ubuntu ubuntu 4096 Aug 19 07:25 ../
lrwxrwxrwx 1 ubuntu ubuntu 22 Aug 19 09:31 matrix-js-sdk -> ../../../matrix-js-sdk/
lrwxrwxrwx 1 ubuntu ubuntu 25 Aug 19 07:25 matrix-react-sdk -> .

Prepare the environment

node -v
v14.17.5
npm install -g [email protected]

Link the SDKs

git clone https://github.com/matrix-org/matrix-js-sdk.git
pushd matrix-js-sdk
yarn link
yarn install
popd


git clone https://github.com/matrix-org/matrix-react-sdk.git
pushd matrix-react-sdk
yarn link
yarn link matrix-js-sdk
yarn install
popd



git clone https://github.com/vector-im/element-web.git
cd element-web
yarn link matrix-js-sdk
yarn link matrix-react-sdk
yarn install
yarn reskindex
yarn start

Start

cp config.sample.json config.json

curl http://127.0.0.1:8080/

Log in

homeserver:http://192.168.43.16:8008/

The Matrix specification does not enforce how users register with a server. It just specifies the URL path and the absolute minimum keys. The reference homeserver uses a username and password to authenticate users, but other homeservers may use different methods. This is why you need to specify the type of method.
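A minimal sketch of what specifying the type looks like in a registration request. It only builds the request and does not send it. The endpoint path `/_matrix/client/v3/register` and the `m.login.dummy` auth type are assumptions about the target homeserver, which may require a different flow or an older path version; `build_register_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_register_request(homeserver: str, username: str, password: str,
                           auth_type: str = "m.login.dummy") -> urllib.request.Request:
    """Build (but do not send) a registration request that names its auth type."""
    body = {
        "username": username,
        "password": password,
        "auth": {"type": auth_type},  # the method the homeserver should use
    }
    return urllib.request.Request(
        homeserver.rstrip("/") + "/_matrix/client/v3/register",  # assumed path version
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Homeserver address taken from the login step above; the credentials are placeholders.
    req = build_register_request("http://192.168.43.16:8008/", "alice", "secret")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it; a homeserver that does not accept the chosen type typically answers with the flows it does support.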

Matrix

Traditionally Matrix decentralises communication by replicating conversation history over a mesh of servers, so that no single server has ownership of a given conversation. Meanwhile, users connect to their given homeserver from clients via plain HTTPS + DNS. This has the significant disadvantage that for a user to have full control and ownership over their communication, they need to run their own server, which comes with a cost and requires you to be a proficient sysadmin.

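Software project timelines are hard to estimate precisely, but in project management we still have to provide a number.

Key characteristics: time and uncertainty

An estimate that states only a time carries an implied level of uncertainty. If you tell me a task will take 10 days, I will assume the schedule is under control. If you tell me it will take 10 to 15 days, that deadline carries a high risk.

Techniques for estimating project duration

Break down tasks
Estimate uncertainty
Estimate the best case and the worst case separately
Track tasks and refine

Break down tasks

Complexity     Time (person-days)
Small          1
Medium         3
Large          5
Extra large    10

There is no need to get the effort estimate right in one pass. Start with a rough estimate, which may still include some fairly complex tasks.

Estimate uncertainty

An estimate of 20-30 person-days is entirely different from one of 5-45 person-days, even though both have a midpoint of 25 person-days.

Uncertainty level    Multiplier
Low                  1.1
Medium               1.5
High                 2
Extreme              5

Calculating effort

Task                                      Complexity    Uncertainty    Expected effort    Worst case
Private deployment of metered payment     Low           High           20                 40

This task is expected to take 20 days, and 40 days in the worst case. Alternatively, the effort can be expressed as 20-40 person-days.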
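The calculation in the last table can be sketched in a few lines: the worst case is the expected effort multiplied by the factor for the task's uncertainty level. The English level names and the `estimate_range` helper are labels chosen for this example.

```python
# Multipliers from the uncertainty table above.
UNCERTAINTY_MULTIPLIER = {
    "low": 1.1,
    "medium": 1.5,
    "high": 2.0,
    "extreme": 5.0,
}


def estimate_range(expected_days: float, uncertainty: str) -> tuple:
    """Return (expected, worst case) in person-days for one task."""
    worst_case = expected_days * UNCERTAINTY_MULTIPLIER[uncertainty]
    return expected_days, worst_case


if __name__ == "__main__":
    expected, worst = estimate_range(20, "high")
    print(f"{expected}-{worst:g} person-days")
```

Summing the expected and worst-case values over all tasks gives a range for the whole project rather than a single number.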

Submitting a form from the browser

Submitting a form with curl

spring-boot

Map the HTTP request header Content-Type to the annotation that handles the request body:

@RequestParam ← application/x-www-form-urlencoded

@RequestBody ← application/json

@RequestPart ← multipart/form-data
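To make the difference between the first two content types concrete, this sketch encodes the same fields both ways using only the Python standard library. The field names are made up for illustration, and multipart/form-data is left out because it needs boundary handling.

```python
import json
from urllib.parse import urlencode

# Hypothetical form fields used only for illustration.
fields = {"name": "Azure", "count": "3"}

# application/x-www-form-urlencoded: key=value pairs joined by "&"
form_body = urlencode(fields)

# application/json: the same fields as a JSON document
json_body = json.dumps(fields)

print("Content-Type: application/x-www-form-urlencoded")
print(form_body)
print("Content-Type: application/json")
print(json_body)
```

On the server side, the first body would be read field by field, while the second would be deserialized as a whole object.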

form-data

Application optimization

Eliminating unnecessary data transfers is, of course, the single best optimization: for example, eliminating unnecessary resources or ensuring that the minimum number of bits is transferred by applying the appropriate compression algorithm. Following that, locating the bits closer to the client by geo-distributing servers around the world (for example, using a CDN) will help reduce the latency of network roundtrips and significantly improve TCP performance. Finally, where possible, existing TCP connections should be reused to minimize the overhead imposed by slow-start and other congestion mechanisms.
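As a small illustration of the first point, the sketch below compresses a repetitive JSON payload with gzip and compares the sizes. The payload is synthetic, so the ratio it prints says nothing about real traffic; it only shows that fewer bytes would cross the network.

```python
import gzip
import json

# Synthetic, highly repetitive payload standing in for an API response.
payload = json.dumps(
    [{"id": i, "status": "ok"} for i in range(1000)]
).encode("utf-8")

compressed = gzip.compress(payload)

print(f"original:   {len(payload)} bytes")
print(f"compressed: {len(compressed)} bytes")
```

Real servers usually apply this transparently through the Content-Encoding response header rather than in application code.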