2026/02/16

Apache Drill

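Apache Drill is written in Java and draws on ideas from Google's BigQuery (more precisely Dremel, the system behind it). It is an open-source, scalable, distributed columnar SQL query engine with support for complex data. It can access HDFS in Apache Hadoop and read data from common file formats (such as Parquet, JSON, and CSV) as well as from supported databases. Drill's main strength is that it lets you query and interact with data in many different formats using SQL.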

After downloading the distribution, extract it into a directory and start embedded mode to check that it runs:

bin/drill-embedded

To exit:

!quit

Distributed mode requires Apache ZooKeeper, plus the corresponding settings in drill-override.conf:

drill.exec: {
  cluster-id: "drillbits1",
  zk.connect: "localhost:2181"
}

Start the ZooKeeper service first.
Then start the Drill server:

bin/drillbit.sh start

To stop the server:

bin/drillbit.sh stop

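Run sqlline to verify that you can connect to the Drill server (the URL is quoted so that the shell does not treat ; as a command separator):

bin/sqlline -u "jdbc:drill:schema=dfs;zk=localhost"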

SSL/TLS

Use keytool to create a keystore:

keytool -genkeypair -alias server \
-dname "CN=localhost, OU=IT Department, O=Orange Inc. ,L=Taipei, S=Taiwan,C=TW" \
-ext SAN=DNS:localhost,IP:127.0.0.1 \
-keyalg RSA -keysize 2048 -sigalg SHA256withRSA -storetype PKCS12 \
-validity 3650 \
-keypass password -keystore ./trusted.keystore -storepass password

Add the corresponding SSL/TLS settings to drill-override.conf:

drill.exec: {
  cluster-id: "drillbits1",
  zk.connect: "localhost:2181",
  ssl: {
    protocol: "TLSv1.3",
    keyStoreType: "pkcs12",
    keyStorePath: "/home/danilo/Programs/drill/conf/trusted.keystore",
    keyStorePassword: "password",
    trustStoreType: "pkcs12",
    trustStorePath: "/home/danilo/Programs/drill/conf/trusted.keystore",
    trustStorePassword: "password"
  },
  security.user.encryption.ssl: {
    enabled: true,
  },
}

To also enable SSL/TLS for the Web UI, extend drill-override.conf as follows:

drill.exec: {
  cluster-id: "drillbits1",
  zk.connect: "localhost:2181",
  ssl: {
    protocol: "TLSv1.3",
    keyStoreType: "pkcs12",
    keyStorePath: "/home/danilo/Programs/drill/conf/trusted.keystore",
    keyStorePassword: "password",
    trustStoreType: "pkcs12",
    trustStorePath: "/home/danilo/Programs/drill/conf/trusted.keystore",
    trustStorePassword: "password"
  },
  security.user.encryption.ssl: {
    enabled: true,
  },
  http: {
    enabled: true,
    ssl_enabled: true,
  },
}

Plain Security

To enable Plain security, configure drill-override.conf as follows:

drill.exec: {
  cluster-id: "drillbits1",
  zk.connect: "localhost:2181",
  ssl: {
    protocol: "TLSv1.3",
    keyStoreType: "pkcs12",
    keyStorePath: "/home/danilo/Programs/drill/conf/trusted.keystore",
    keyStorePassword: "password",
    trustStoreType: "pkcs12",
    trustStorePath: "/home/danilo/Programs/drill/conf/trusted.keystore",
    trustStorePassword: "password"
  },
  security: {
    auth.mechanisms : ["PLAIN"],
  },
  security.user.auth {
    enabled: true,
    packages += "org.apache.drill.exec.rpc.user.security",
    impl: "pam4j",
    pam_profiles: [ "sudo", "login" ]
  },
  security.user.encryption.ssl: {
    enabled: true,
  },
  http: {
    enabled: true,
    ssl_enabled: true,
    auth: {
      mechanisms: ["FORM"],
    },
  },
}

Apache Drill supports either libpam4j or JPam as the PAM authenticator. libpam4j is bundled with Drill, so it works as soon as the configuration is correct.

Apache HBase

REST interface

Start the HBase REST server in the foreground (use -p to specify the port):

bin/hbase rest start -p 8090

Start the HBase REST server in the background (use -p to specify the port):

bin/hbase-daemon.sh start rest -p 8090

Stop the HBase REST server (background):

bin/hbase-daemon.sh stop rest

SSL/TLS

Use keytool to create a keystore:

keytool -genkeypair -alias server \
-dname "CN=localhost, OU=IT Department, O=Orange Inc. ,L=Taipei, S=Taiwan,C=TW" \
-ext SAN=DNS:localhost,IP:127.0.0.1 \
-keyalg RSA -keysize 2048 -sigalg SHA256withRSA -storetype PKCS12 \
-validity 3650 \
-keypass password -keystore ./trusted.keystore -storepass password

Edit hbase-site.xml:

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///home/danilo/Programs/hbase</value>
  </property>
  <property>
    <name>hbase.tmp.dir</name>
    <value>./tmp</value>
  </property>
  <property>
    <name>hbase.unsafe.stream.capability.enforce</name>
    <value>false</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>  
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/var/lib/zookeeper</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>localhost</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.rest.ssl.enabled</name>
    <value>true</value>
  </property> 
  <property>
    <name>hbase.rest.ssl.keystore.store</name>
    <value>/home/danilo/Programs/hbase/conf/trusted.keystore</value>
  </property> 
  <property>
    <name>hbase.rest.ssl.keystore.password</name>
    <value>password</value>
  </property> 
  <property>
    <name>hbase.rest.ssl.keystore.type</name>
    <value>pkcs12</value>
  </property>
  <property>
    <name>hbase.rest.ssl.truststore.store</name>
    <value>/home/danilo/Programs/hbase/conf/trusted.keystore</value>
  </property>
  <property>
    <name>hbase.rest.ssl.truststore.password</name>
    <value>password</value>
  </property>
  <property>
    <name>hbase.rest.ssl.truststore.type</name>
    <value>pkcs12</value>
  </property>
</configuration>

Then restart the HBase server and the HBase REST server.

Apache Geode

Introduction

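Apache Geode is a data management platform that provides real-time, consistent access to data-intensive applications across widely distributed cloud architectures. It is typically used as an in-memory data grid (IMDG), as a cache, and wherever real-time processing is required. Geode provides a SQL-like query language called OQL (Object Query Language). Because Geode stores data as Java objects, it uses OQL to query the stored data.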

After downloading Apache Geode, extract it into a directory. gfsh is the shell tool used to manage Apache Geode.

Start gfsh. The commands below are run inside gfsh and follow the Apache Geode in 15 Minutes or Less tutorial.

A locator is a Geode process that tells newly connecting members where the running members are and provides load balancing for server use.

start locator --name=locator1

Geode provides a web-based monitoring interface; the command below starts it. The default user is admin and the default password is admin.

start pulse

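A Geode server is a process that runs as a long-lived, configurable member of a cluster. It is used primarily to host long-lived data regions and to run standard Geode processes, such as the server in a client/server configuration.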

start server --name=server1 --server-port=40411

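Regions are the core building blocks of a Geode cluster and are used to organize data. The region created in this exercise uses replication to copy data across cluster members and persistence to save the data to disk.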

create region --name=regionA --type=REPLICATE_PERSISTENT

List the current regions:

list regions

List the members of the Geode cluster:

list members

Describe the region regionA:

describe region --name=regionA

Next, use put to add data and query to retrieve it.

put --region=regionA --key="1" --value="one"

put --region=regionA --key="2" --value="two"

query --query="select * from /regionA"

To destroy a region:

destroy region --name=regionA

To stop the server:

stop server --name=server1

Shut down the system, including the locator:

shutdown --include-locators=true

REST

Geode lets users access data through a REST interface.

Start a locator:

start locator --name=locator1

Then apply the following setting:

configure pdx --read-serialized=true --disk-store

Then add the --start-rest-api option when starting the Geode server:

start server --name=server1 --server-port=40411 \
--start-rest-api=true \
--http-service-port=8080 --http-service-bind-address=localhost

Use curl to verify that the REST API is available:

curl -i http://localhost:8080/geode/v1

SSL

Use keytool to create a keystore:

keytool -genkeypair -alias server \
-dname "CN=localhost, OU=IT Department, O=Orange Inc. ,L=Taipei, S=Taiwan,C=TW" \
-ext SAN=DNS:localhost,IP:127.0.0.1 \
-keyalg RSA -keysize 2048 -sigalg SHA256withRSA -storetype PKCS12 \
-validity 3650 \
-keypass password -keystore ./trusted.keystore -storepass password

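Create a new file named gfsecurity.properties in Geode's config directory. Apache Geode uses ssl-enabled-components to control which components communicate over SSL/TLS. The value all enables it for every component; here it is set to web, which applies it to the REST interface.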

ssl-enabled-components=web
ssl-protocols=TLSv1.2,TLSv1.3
ssl-ciphers=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384
ssl-keystore=/home/danilo/Programs/geode/config/trusted.keystore
ssl-keystore-password=password
ssl-keystore-type=pkcs12
ssl-truststore=/home/danilo/Programs/geode/config/trusted.keystore
ssl-truststore-password=password
ssl-truststore-type=pkcs12

Use gfsh to start a locator:

start locator --name=locator1 --port=12345 \
--security-properties-file=/home/danilo/Programs/geode/config/gfsecurity.properties

Then apply the following setting:

configure pdx --read-serialized=true --disk-store

Use gfsh to start a server:

start server --name=server1 --server-port=40411 \
--start-rest-api=true \
--http-service-port=8080 --http-service-bind-address=localhost \
--security-properties-file=/home/danilo/Programs/geode/config/gfsecurity.properties

Use curl to verify that it works (-k skips certificate verification, which the self-signed certificate requires):

curl -k -i https://localhost:8080/geode/v1

Authentication

The approach below no longer works on Java 24/25 and later, because the Security Manager is disabled. The Security Manager was deprecated for removal in Java 17 under JEP 411, and in JDK 24 its functionality is effectively disabled. As a result, HTTP Basic Authentication for Apache Geode can no longer be set up through SecurityManager from JDK 24 onward.

Save the following content as security.json and place it in the directory of each locator and server.

{
  "roles": [
    {
      "name": "cluster",
      "operationsAllowed": [
        "CLUSTER:MANAGE",
        "CLUSTER:WRITE",
        "CLUSTER:READ"
      ]
    },
    {
      "name": "data",
      "operationsAllowed": [
        "DATA:MANAGE",
        "DATA:WRITE",
        "DATA:READ"
      ]
    },
    {
      "name": "region1&2Reader",
      "operationsAllowed": [
        "DATA:READ"
      ],
      "regions": ["region1", "region2"]
    }
  ],
  "users": [
    {
      "name": "super-user",
      "password": "1234567",
      "roles": [
        "cluster",
        "data"
      ]
    },
    {
      "name": "joebloggs",
      "password": "1234567",
      "roles": [
        "data"
      ]
    }
  ]
}
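To make the relationship between roles and users in this file easier to see, here is a minimal Python sketch that resolves whether a user may perform an operation. It is only my simplified reading of the file layout, not Geode's ExampleSecurityManager implementation; the function name is_allowed and the rule that a role without a regions list applies to every region are assumptions made for illustration. The JSON is the same content as above, condensed.

```python
import json

# Same roles and users as security.json above, condensed onto fewer lines.
SECURITY_JSON = """
{
  "roles": [
    {"name": "cluster", "operationsAllowed": ["CLUSTER:MANAGE", "CLUSTER:WRITE", "CLUSTER:READ"]},
    {"name": "data", "operationsAllowed": ["DATA:MANAGE", "DATA:WRITE", "DATA:READ"]},
    {"name": "region1&2Reader", "operationsAllowed": ["DATA:READ"], "regions": ["region1", "region2"]}
  ],
  "users": [
    {"name": "super-user", "password": "1234567", "roles": ["cluster", "data"]},
    {"name": "joebloggs", "password": "1234567", "roles": ["data"]}
  ]
}
"""


def is_allowed(config, user_name, operation, region=None):
    """Hypothetical helper: True if any role of the user grants the operation."""
    roles = {role["name"]: role for role in config["roles"]}
    user = next((u for u in config["users"] if u["name"] == user_name), None)
    if user is None:
        return False
    for role_name in user["roles"]:
        role = roles.get(role_name)
        if role is None or operation not in role["operationsAllowed"]:
            continue
        # Assumption: a role without "regions" applies to every region.
        if "regions" not in role or region in role["regions"]:
            return True
    return False


config = json.loads(SECURITY_JSON)
print(is_allowed(config, "super-user", "CLUSTER:MANAGE"))
print(is_allowed(config, "joebloggs", "CLUSTER:READ"))
```

Under these assumptions, super-user holds both the cluster and data roles, while joebloggs can only perform DATA operations.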

Use gfsh to start a locator:

start locator --name=locator1 --port=12345 \
--security-properties-file=/home/danilo/Programs/geode/config/gfsecurity.properties \
--J=-Dgemfire.security-manager=org.apache.geode.examples.security.ExampleSecurityManager \
--classpath=.

Connecting to the locator now requires authentication:

connect --locator=localhost[12345] --user=super-user --password=1234567

Then apply the following setting:

configure pdx --read-serialized=true --disk-store

Use gfsh to start a server:

start server --name=server1 --locators=localhost[12345] --server-port=40411 \
--start-rest-api=true \
--http-service-port=8080 --http-service-bind-address=localhost \
--security-properties-file=/home/danilo/Programs/geode/config/gfsecurity.properties \
--J=-Dgemfire.security-manager=org.apache.geode.examples.security.ExampleSecurityManager \
--classpath=. --user=super-user --password=1234567

Memcached

Apache Geode provides an interface compatible with the memcached protocol.

Use gfsh to start a locator:

start locator --name=locator1

Then add or edit cache.xml in the config directory with the following settings:

<?xml version="1.0" encoding="UTF-8"?>
<cache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
       xmlns="http://geode.apache.org/schema/cache" 
       xsi:schemaLocation="http://geode.apache.org/schema/cache http://geode.apache.org/schema/cache/cache-1.0.xsd"
       version="1.0">
  <region name="gemcached">
    <region-attributes refid="PARTITION" />
  </region>
</cache>

Use gfsh to start a server:

start server --name=server1 --server-port=40411 \
--memcached-port=11211 --memcached-bind-address=localhost \
--memcached-protocol=BINARY \
--cache-xml-file=/home/danilo/Programs/geode/config/cache.xml

--memcached-protocol can be set to ASCII or BINARY. ASCII is the default for libMemcached, so the client must be configured explicitly to use BINARY.

Use memcached-for-Tcl to verify:

package require Memcache

memcache server add localhost 11211
memcache behavior MEMCACHED_BEHAVIOR_BINARY_PROTOCOL 1
memcache set moo "cows go moo"
memcache get moo result
puts $result

XQuery study notes

Introduction to XML

XML stands for Extensible Markup Language. It is a text-based markup language derived from Standard Generalized Markup Language (SGML).

XML is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. The following example shows what XML markup looks like when embedded in a piece of text:

<message>
   <text>Hello, world!</text>
</message>

Notice that the example above contains two kinds of information:

Markup (tag)
The text, or the character data

An XML document can be represented as a tree. An XML tree starts at the root element and branches from there to its child elements. XML elements can have attributes, as in the following example:

<note date="2023/04/28">
  <name>Orange</name>
</note>
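As a minimal sketch of the tree view described above, the following snippet parses the note document and reads the root element, its attribute, and its child. Python and its standard xml.etree.ElementTree module are my choice for illustration and are not used elsewhere in these notes; the helper name describe is hypothetical.

```python
import xml.etree.ElementTree as ET

NOTE_XML = """<note date="2023/04/28">
  <name>Orange</name>
</note>"""


def describe(xml_text):
    """Return the root tag, its attributes, and (tag, text) for each child."""
    root = ET.fromstring(xml_text)
    children = [(child.tag, child.text) for child in root]
    return root.tag, dict(root.attrib), children


root_tag, attributes, children = describe(NOTE_XML)
print(root_tag, attributes, children)
```

The root element note carries the date attribute, and name is its only child element.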

An XML document can optionally begin with an XML declaration, written as follows:

<?xml version = "1.0" encoding = "UTF-8"?>

Introduction to XPath

XPath (XML Path Language) is an expression language designed to support the query or transformation of XML documents. It was defined by the World Wide Web Consortium (W3C) in 1999.

In XPath, there are seven kinds of nodes: element, attribute, text, namespace, processing-instruction, comment, and root nodes.

The most important kind of expression in XPath is a location path. A location path consists of a sequence of location steps. Each location step has three components:

an axis
a node test
zero or more predicates.

Axis specifiers in XPath
Full syntax	Abbreviated syntax	Notes
`ancestor`
`ancestor-or-self`
`attribute`	`@`	`@abc` is short for `attribute::abc`
`child`		`xyz` is short for `child::xyz`
`descendant`
`descendant-or-self`	`//`	`//` is short for `/descendant-or-self::node()/`
`following`
`following-sibling`
`namespace`
`parent`	`..`	`..` is short for `parent::node()`
`preceding`
`preceding-sibling`
`self`	`.`	`.` is short for `self::node()`

Node tests may consist of specific node names or more general expressions. In the case of an XML document in which the namespace prefix gs has been defined, //gs:enquiry will find all the enquiry elements in that namespace, and //gs:* will find all elements, regardless of local name, in that namespace.

Other node test formats are:

comment(): finds an XML comment node, e.g. <!-- Comment -->
text(): finds a node of type text excluding any children, e.g. the hello in <k>hello<m> world</m></k>
processing-instruction(): finds XML processing instructions such as <?php echo $a; ?>. In this case, processing-instruction('php') would match.
node(): finds any node at all.

Predicates, written as expressions in square brackets, can be used to filter a node-set according to some condition. For example, a returns a node-set (all the a elements which are children of the context node), and a[@href='help.php'] keeps only those elements having an href attribute with the value help.php.
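The location path and predicate just described can be tried out with a small sketch. It uses Python's standard xml.etree.ElementTree, which supports only a subset of XPath (an assumption worth keeping in mind), and the sample document is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only the a/@href structure matters.
PAGE_XML = """<page>
  <a href="help.php">Help</a>
  <a href="index.php">Home</a>
  <section><a href="help.php">More help</a></section>
</page>"""

root = ET.fromstring(PAGE_XML)

# "a" selects the a elements that are children of the context node (root).
child_links = [a.text for a in root.findall("a")]

# The predicate keeps only elements whose href attribute equals help.php.
help_children = [a.text for a in root.findall("a[@href='help.php']")]

# ".//a" also searches descendants, like the abbreviated // axis.
help_anywhere = [a.text for a in root.findall(".//a[@href='help.php']")]

print(child_links, help_children, help_anywhere)
```

The predicate narrows the node-set without changing which axis is searched; switching from a to .//a widens the search to descendants.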

Introduction to XQuery

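XQuery is a query language defined by the W3C. It follows a functional programming style and builds on XPath, using XPath expressions to describe the paths to query. It is designed for searching, manipulating, and transforming structured or semi-structured XML data, much as SQL is for relational databases. XQuery 3.1 added JSON support, so XQuery can also be used to process JSON data (if you want to).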

The XQuery versions so far are:

XQuery 1.0 became a W3C Recommendation on January 23, 2007
XQuery 3.0 became a W3C Recommendation on April 8, 2014
XQuery 3.1 became a W3C Recommendation on March 21, 2017

Every XQuery statement that performs a computation is an expression. At the core is FLWOR, which is used for more complex queries:

For: iterates over items.
Let: binds a value to a variable.
Where: sets the filter condition.
Order By: sorts the results.
Return: defines the output.

Here is a Hello World example in XQuery:

let $message := 'Hello World!'
return
<results>
   <message>{$message}</message>
</results>

Here is an XML file, books.xml:

<?xml version="1.0" encoding="UTF-8"?>
<books>
   
   <book category="JAVA">
      <title lang="en">Learn Java in 24 Hours</title>
      <author>Robert</author>
      <year>2005</year>
      <price>30.00</price>
   </book>
   
   <book category="DOTNET">
      <title lang="en">Learn .Net in 24 hours</title>
      <author>Peter</author>
      <year>2011</year>
      <price>70.50</price>
   </book>
   
   <book category="XML">
      <title lang="en">Learn XQuery in 24 hours</title>
      <author>Robert</author>
      <author>Peter</author> 
      <year>2013</year>
      <price>50.00</price>
   </book>
   
   <book category="XML">
      <title lang="en">Learn XPath in 24 hours</title>
      <author>Jay Ban</author>
      <year>2010</year>
      <price>16.50</price>
   </book>
   
</books>

XQuery can use the doc() function to load the contents of an XML file. Here is an example:

(: XQuery Comment :)
let $books := (doc("books.xml")/books/book)
return <results>
{
   for $x in $books
   where $x/price>30
   order by $x/price
   return $x/title
}
</results>
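For readers more familiar with imperative code, the query above can be sketched step by step in Python. This is only an illustrative equivalent under a few assumptions: the books are embedded as a condensed string (titles and prices only) instead of being loaded from books.xml, prices are converted to float before comparing and sorting, and the function name expensive_titles is hypothetical.

```python
import xml.etree.ElementTree as ET

# Condensed copy of books.xml: only the fields the query touches.
BOOKS_XML = """<books>
  <book category="JAVA"><title lang="en">Learn Java in 24 Hours</title><price>30.00</price></book>
  <book category="DOTNET"><title lang="en">Learn .Net in 24 hours</title><price>70.50</price></book>
  <book category="XML"><title lang="en">Learn XQuery in 24 hours</title><price>50.00</price></book>
  <book category="XML"><title lang="en">Learn XPath in 24 hours</title><price>16.50</price></book>
</books>"""


def expensive_titles(xml_text, min_price=30.0):
    books = ET.fromstring(xml_text).findall("book")           # for $x in $books
    selected = [b for b in books
                if float(b.findtext("price")) > min_price]    # where $x/price > 30
    selected.sort(key=lambda b: float(b.findtext("price")))   # order by $x/price
    return [b.findtext("title") for b in selected]            # return $x/title


titles = expensive_titles(BOOKS_XML)
print(titles)
```

Each FLWOR clause maps onto one line: the iteration, the filter, the sort, and the projection of the title.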

XQuery can use for to loop, as shown below:

for $n in 1 to 10
return
    <result>{$n}</result>

Sequences represent an ordered collection of items, which may be of the same or of different types. A sequence is written in parentheses, with strings in single or double quotes and numbers written as-is. XML elements can also be items of a sequence.

Viewing items in a sequence

let $sequence := ('a', 'b', 'c', 'd', 'e', 'f')
let $count := count($sequence)
return
   <results>
      <count>{$count}</count>
      <items>
       {
         for $item in $sequence
         return
           <item>{$item}</item>
       }
      </items>
   </results>

XQuery has built-in support for regular expressions. Here is an example:

let $input := 'TutorialsPoint Simply Easy Learning'
return (
  matches($input, 'Hello') =  true(),
  matches($input, 'T.* S.* E.* L.*') =  true()
)
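One detail worth noting is that matches() succeeds when the pattern matches any part of the input, not only the whole string. The sketch below mirrors that behaviour with Python's re.search; Python is used purely for illustration, its regex dialect differs from XQuery's in details, and the helper name matches is my own.

```python
import re

text = "TutorialsPoint Simply Easy Learning"


def matches(value, pattern):
    """Unanchored match, analogous to the XQuery matches() call above."""
    return re.search(pattern, value) is not None


results = (matches(text, "Hello"), matches(text, "T.* S.* E.* L.*"))
print(results)
```

The first pattern is absent from the input, while the second matches across the four words.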

XQuery supports conditionals with if-then-else. The else branch is mandatory, so use else () when there is nothing to return.

<result>
{
   for $book in doc("books.xml")/books/book
   return
   if ($book/@category = "XML") then (
     $book/title
   ) else ()
}
</result>

XQuery 3.0 added support for lambda functions. Here is an example:

let $fn := function($x, $y) { $x + $y }
return $fn(99, 2)

XQuery 3.0 added support for switch. Here is an example:

for $fruit in ("Apple", "Pear", "Peach")
return switch ($fruit)
  case "Apple" return "red"
  case "Pear"  return "green"
  case "Peach" return "pink"
  default      return "unknown"

XQuery 3.0 added support for try/catch. Here is an example:

try {
  1 + '2'
} catch * {
  'Error [' || $err:code || ']: ' || $err:description
}

XQuery 3.0 added the || operator for string concatenation; it is effectively a shortcut for the concat() function.

'Hello' || ' ' || 'Universe'

XQuery 3.0 added the simple map operator !, which evaluates the second expression once for each item in the result of the first. Here is an example:

(1 to 10) ! element node { . }

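XQuery 3.1 added the arrow operator =>, a convenient alternative syntax for applying a function to a value. The expression before the operator is supplied as the first argument to the function after the arrow.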

'w e l c o m e' => upper-case() => tokenize() => string-join('-')

Here is the equivalent written without the arrow operator:

string-join(tokenize(upper-case('w e l c o m e')), '-')
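The difference between the two spellings is only the reading order: the arrow form lists the steps left to right, while the nested form reads inside out. A rough analogue in Python, with a hypothetical pipe helper standing in for the => operator, looks like this:

```python
from functools import reduce


def pipe(value, *steps):
    """Feed value through each step in order, like a chain of => calls."""
    return reduce(lambda acc, step: step(acc), steps, value)


# upper-case() => tokenize() => string-join('-')
result = pipe("w e l c o m e", str.upper, str.split, "-".join)
print(result)
```

Here str.split without arguments splits on whitespace, which plays the role of the one-argument tokenize() in the XQuery example.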

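XQuery 3.1 added maps and arrays to support processing JSON data. A map is a function that associates a set of keys with values, producing a collection of key/value pairs; it is used to handle JSON objects. An array is a function that associates a set of positions (keys expressed as positive integers) with values. The first position in an array corresponds to the integer 1; arrays are used to handle JSON arrays.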

let $map := map { 'foo': 42, 'bar': 'baz', 123: 456 }
return for-each(map:keys($map), $map)

let $array := array { 48 to 52 }
for $i in 1 to array:size($array)
return $array($i)

The lookup operator provides syntactic sugar for accessing the values of a map or array. It starts with a question mark (?) followed by a specifier. The specifier can be:

A wildcard *,
The name of the key,
The integer offset, or
Any other parenthesized expression.

let $map := map { 'R': 'red', 'G': 'green', 'B': 'blue' }
return (
  $map?*           (: returns all values; same as: map:keys($map) ! $map(.) :),
  $map?R           (: returns the value for key 'R'; same as: $map('R') :),
  $map?('G', 'B')  (: returns the values for key 'G' and 'B' :)
)

let $maps := (
  map { 'name': 'Guðrún', 'city': 'Reykjavík' },
  map { 'name': 'Hildur', 'city': 'Akureyri' }
)
return $maps[?name = 'Hildur'] ?city
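As a rough point of comparison, the lookups above correspond to ordinary dictionary access in Python. The sketch below is only an analogy (the variable names are mine), but it may help when reading the ? syntax for the first time.

```python
color_map = {"R": "red", "G": "green", "B": "blue"}

all_values = list(color_map.values())             # $map?*
red = color_map["R"]                              # $map?R
green_blue = [color_map[k] for k in ("G", "B")]   # $map?('G', 'B')

maps = [
    {"name": "Guðrún", "city": "Reykjavík"},
    {"name": "Hildur", "city": "Akureyri"},
]

# $maps[?name = 'Hildur'] ?city : filter the maps, then look up city in each.
cities = [m["city"] for m in maps if m["name"] == "Hildur"]

print(all_values, red, green_blue, cities)
```

Python dictionaries keep insertion order, so all_values comes back in a fixed order here, whereas the order returned by $map?* should not be relied on.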

XQuery 3.1 provides JSON serialization. Here is an example:

declare option output:method 'json';
map { "key": "value" }

XQuery 3.1 uses fn:parse-json() for JSON deserialization:

let $json-input := '{ "firstName": "John", "lastName": "Smith", "address": { "city": "New York" }, "phoneNumbers": ["212-732-1234", "646-123-4567"] }'
let $json-data := fn:parse-json($json-input)
return
  $json-data

XQuery 3.1 can also read external JSON files:

let $json-data := fn:json-doc("/path/to/data.json")
return $json-data

JSON can also be converted to an XML document:

let $json-string := '{ "name": "John", "age": 30, "city": "New York" }'
return fn:json-to-xml($json-string)
