
Commit

Project import generated by Copybara.
GitOrigin-RevId: 33adfdf31f3a5cbf9edc07ee1ea583e95080bdc5
MediaPipe Team authored and chuoling committed Jun 24, 2021
1 parent b544a31 commit 1392370
Showing 151 changed files with 4,023 additions and 1,001 deletions.
1 change: 1 addition & 0 deletions MANIFEST.in
@@ -8,6 +8,7 @@ include README.md
include requirements.txt

recursive-include mediapipe/modules *.tflite *.txt *.binarypb
+exclude mediapipe/modules/face_detection/face_detection_full_range.tflite
exclude mediapipe/modules/objectron/object_detection_3d_chair_1stage.tflite
exclude mediapipe/modules/objectron/object_detection_3d_sneakers_1stage.tflite
exclude mediapipe/modules/objectron/object_detection_3d_sneakers.tflite
52 changes: 14 additions & 38 deletions README.md
@@ -55,46 +55,22 @@ See also
[MediaPipe Models and Model Cards](https://google.github.io/mediapipe/solutions/models)
for ML models released in MediaPipe.

-## MediaPipe in Python
-
-MediaPipe offers customizable Python solutions as a prebuilt Python package on
-[PyPI](https://pypi.org/project/mediapipe/), which can be installed simply with
-`pip install mediapipe`. It also provides tools for users to build their own
-solutions. Please see
-[MediaPipe in Python](https://google.github.io/mediapipe/getting_started/python)
-for more info.
-
-## MediaPipe on the Web
-
-MediaPipe on the Web is an effort to run the same ML solutions built for mobile
-and desktop also in web browsers. The official API is under construction, but
-the core technology has been proven effective. Please see
-[MediaPipe on the Web](https://developers.googleblog.com/2020/01/mediapipe-on-web.html)
-in Google Developers Blog for details.
-
-You can use the following links to load a demo in the MediaPipe Visualizer, and
-over there click the "Runner" icon in the top bar like shown below. The demos
-use your webcam video as input, which is processed all locally in real-time and
-never leaves your device.
-
-![visualizer_runner](docs/images/visualizer_runner.png)
-
-* [MediaPipe Face Detection](https://viz.mediapipe.dev/demo/face_detection)
-* [MediaPipe Iris](https://viz.mediapipe.dev/demo/iris_tracking)
-* [MediaPipe Iris: Depth-from-Iris](https://viz.mediapipe.dev/demo/iris_depth)
-* [MediaPipe Hands](https://viz.mediapipe.dev/demo/hand_tracking)
-* [MediaPipe Hands (palm/hand detection only)](https://viz.mediapipe.dev/demo/hand_detection)
-* [MediaPipe Pose](https://viz.mediapipe.dev/demo/pose_tracking)
-* [MediaPipe Hair Segmentation](https://viz.mediapipe.dev/demo/hair_segmentation)
-
## Getting started

-Learn how to [install](https://google.github.io/mediapipe/getting_started/install)
-MediaPipe and
-[build example applications](https://google.github.io/mediapipe/getting_started/building_examples),
-and start exploring our ready-to-use
-[solutions](https://google.github.io/mediapipe/solutions/solutions) that you can
-further extend and customize.
+To start using MediaPipe
+[solutions](https://google.github.io/mediapipe/solutions/solutions) with only a few
+lines of code, see example code and demos in
+[MediaPipe in Python](https://google.github.io/mediapipe/getting_started/python) and
+[MediaPipe in JavaScript](https://google.github.io/mediapipe/getting_started/javascript).
+
+To use MediaPipe in C++, Android and iOS, which allow further customization of
+the [solutions](https://google.github.io/mediapipe/solutions/solutions) as well as
+building your own, learn how to
+[install](https://google.github.io/mediapipe/getting_started/install) MediaPipe and
+start building example applications in
+[C++](https://google.github.io/mediapipe/getting_started/cpp),
+[Android](https://google.github.io/mediapipe/getting_started/android) and
+[iOS](https://google.github.io/mediapipe/getting_started/ios).

The source code is hosted in the
[MediaPipe Github repository](https://github.com/google/mediapipe), and you can
4 changes: 2 additions & 2 deletions WORKSPACE
@@ -351,8 +351,8 @@ maven_install(
"androidx.test.espresso:espresso-core:3.1.1",
"com.github.bumptech.glide:glide:4.11.0",
"com.google.android.material:material:aar:1.0.0-rc01",
"com.google.auto.value:auto-value:1.6.4",
"com.google.auto.value:auto-value-annotations:1.6.4",
"com.google.auto.value:auto-value:1.8.1",
"com.google.auto.value:auto-value-annotations:1.8.1",
"com.google.code.findbugs:jsr305:3.0.2",
"com.google.flogger:flogger-system-backend:0.3.1",
"com.google.flogger:flogger:0.3.1",
7 changes: 3 additions & 4 deletions docs/getting_started/android_archive_library.md
@@ -92,12 +92,12 @@ each project.
and copy
[the binary graph](https://github.com/google/mediapipe/blob/master/mediapipe/examples/android/src/java/com/google/mediapipe/apps/facedetectiongpu/BUILD#L41)
and
-[the face detection tflite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_front.tflite).
+[the face detection tflite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_short_range.tflite).
```bash
bazel build -c opt mediapipe/graphs/face_detection:face_detection_mobile_gpu_binary_graph
cp bazel-bin/mediapipe/graphs/face_detection/face_detection_mobile_gpu.binarypb /path/to/your/app/src/main/assets/
-cp mediapipe/modules/face_detection/face_detection_front.tflite /path/to/your/app/src/main/assets/
+cp mediapipe/modules/face_detection/face_detection_short_range.tflite /path/to/your/app/src/main/assets/
```
![Screenshot](../images/mobile/assets_location.png)
@@ -117,15 +117,14 @@ each project.
implementation 'com.google.flogger:flogger-system-backend:0.3.1'
implementation 'com.google.code.findbugs:jsr305:3.0.2'
implementation 'com.google.guava:guava:27.0.1-android'
-implementation 'com.google.guava:guava:27.0.1-android'
implementation 'com.google.protobuf:protobuf-java:3.11.4'
// CameraX core library
def camerax_version = "1.0.0-beta10"
implementation "androidx.camera:camera-core:$camerax_version"
implementation "androidx.camera:camera-camera2:$camerax_version"
implementation "androidx.camera:camera-lifecycle:$camerax_version"
// AutoValue
-def auto_value_version = "1.6.4"
+def auto_value_version = "1.8.1"
implementation "com.google.auto.value:auto-value-annotations:$auto_value_version"
annotationProcessor "com.google.auto.value:auto-value:$auto_value_version"
}
Binary file added docs/images/mobile/pose_world_landmarks.mp4
Binary file not shown.
52 changes: 14 additions & 38 deletions docs/index.md
@@ -55,46 +55,22 @@ See also
[MediaPipe Models and Model Cards](https://google.github.io/mediapipe/solutions/models)
for ML models released in MediaPipe.

-## MediaPipe in Python
-
-MediaPipe offers customizable Python solutions as a prebuilt Python package on
-[PyPI](https://pypi.org/project/mediapipe/), which can be installed simply with
-`pip install mediapipe`. It also provides tools for users to build their own
-solutions. Please see
-[MediaPipe in Python](https://google.github.io/mediapipe/getting_started/python)
-for more info.
-
-## MediaPipe on the Web
-
-MediaPipe on the Web is an effort to run the same ML solutions built for mobile
-and desktop also in web browsers. The official API is under construction, but
-the core technology has been proven effective. Please see
-[MediaPipe on the Web](https://developers.googleblog.com/2020/01/mediapipe-on-web.html)
-in Google Developers Blog for details.
-
-You can use the following links to load a demo in the MediaPipe Visualizer, and
-over there click the "Runner" icon in the top bar like shown below. The demos
-use your webcam video as input, which is processed all locally in real-time and
-never leaves your device.
-
-![visualizer_runner](images/visualizer_runner.png)
-
-* [MediaPipe Face Detection](https://viz.mediapipe.dev/demo/face_detection)
-* [MediaPipe Iris](https://viz.mediapipe.dev/demo/iris_tracking)
-* [MediaPipe Iris: Depth-from-Iris](https://viz.mediapipe.dev/demo/iris_depth)
-* [MediaPipe Hands](https://viz.mediapipe.dev/demo/hand_tracking)
-* [MediaPipe Hands (palm/hand detection only)](https://viz.mediapipe.dev/demo/hand_detection)
-* [MediaPipe Pose](https://viz.mediapipe.dev/demo/pose_tracking)
-* [MediaPipe Hair Segmentation](https://viz.mediapipe.dev/demo/hair_segmentation)
-
## Getting started

-Learn how to [install](https://google.github.io/mediapipe/getting_started/install)
-MediaPipe and
-[build example applications](https://google.github.io/mediapipe/getting_started/building_examples),
-and start exploring our ready-to-use
-[solutions](https://google.github.io/mediapipe/solutions/solutions) that you can
-further extend and customize.
+To start using MediaPipe
+[solutions](https://google.github.io/mediapipe/solutions/solutions) with only a few
+lines of code, see example code and demos in
+[MediaPipe in Python](https://google.github.io/mediapipe/getting_started/python) and
+[MediaPipe in JavaScript](https://google.github.io/mediapipe/getting_started/javascript).
+
+To use MediaPipe in C++, Android and iOS, which allow further customization of
+the [solutions](https://google.github.io/mediapipe/solutions/solutions) as well as
+building your own, learn how to
+[install](https://google.github.io/mediapipe/getting_started/install) MediaPipe and
+start building example applications in
+[C++](https://google.github.io/mediapipe/getting_started/cpp),
+[Android](https://google.github.io/mediapipe/getting_started/android) and
+[iOS](https://google.github.io/mediapipe/getting_started/ios).

The source code is hosted in the
[MediaPipe Github repository](https://github.com/google/mediapipe), and you can
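The rewritten getting-started text above points to Python as the quickest path. As a minimal sketch of what "a few lines of code" means in practice (assuming the `mediapipe` package from PyPI and OpenCV for image I/O; the input path is a placeholder):

```python
# pip install mediapipe opencv-python
import cv2
import mediapipe as mp

# Run MediaPipe Face Detection on one image; 'photo.jpg' is a placeholder path.
with mp.solutions.face_detection.FaceDetection(min_detection_confidence=0.5) as fd:
    image = cv2.imread('photo.jpg')
    # The solution API expects RGB input; OpenCV loads BGR.
    results = fd.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    print('Faces detected:', len(results.detections) if results.detections else 0)
```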
20 changes: 14 additions & 6 deletions docs/solutions/face_detection.md
@@ -45,6 +45,15 @@ section.

Naming style and availability may differ slightly across platforms/languages.

+#### model_selection
+
+An integer index `0` or `1`. Use `0` to select a short-range model that works
+best for faces within 2 meters from the camera, and `1` for a full-range model
+best for faces within 5 meters. For the full-range option, a sparse model is
+used for its improved inference speed. Please refer to the
+[model cards](./models.md#face_detection) for details. Defaults to `0` if not
+specified.

#### min_detection_confidence

Minimum confidence value (`[0.0, 1.0]`) from the face detection model for the
@@ -72,6 +81,7 @@ install MediaPipe Python package, then learn more in the companion

Supported configuration options:

+* [model_selection](#model_selection)
* [min_detection_confidence](#min_detection_confidence)

```python
@@ -83,7 +93,7 @@ mp_drawing = mp.solutions.drawing_utils
# For static images:
IMAGE_FILES = []
with mp_face_detection.FaceDetection(
-    min_detection_confidence=0.5) as face_detection:
+    model_selection=1, min_detection_confidence=0.5) as face_detection:
for idx, file in enumerate(IMAGE_FILES):
image = cv2.imread(file)
# Convert the BGR image to RGB and process it with MediaPipe Face Detection.
@@ -103,7 +113,7 @@ with mp_face_detection.FaceDetection(
# For webcam input:
cap = cv2.VideoCapture(0)
with mp_face_detection.FaceDetection(
-    min_detection_confidence=0.5) as face_detection:
+    model_selection=0, min_detection_confidence=0.5) as face_detection:
while cap.isOpened():
success, image = cap.read()
if not success:
@@ -139,6 +149,7 @@ and the following usage example.

Supported configuration options:

+* [modelSelection](#model_selection)
* [minDetectionConfidence](#min_detection_confidence)

```html
@@ -189,6 +200,7 @@ const faceDetection = new FaceDetection({locateFile: (file) => {
return `https://cdn.jsdelivr.net/npm/@mediapipe/[email protected]/${file}`;
}});
faceDetection.setOptions({
+  modelSelection: 0,
minDetectionConfidence: 0.5
});
faceDetection.onResults(onResults);
@@ -255,10 +267,6 @@ same configuration as the GPU pipeline, runs entirely on CPU.
* Target:
[`mediapipe/examples/desktop/face_detection:face_detection_gpu`](https://github.com/google/mediapipe/tree/master/mediapipe/examples/desktop/face_detection/BUILD)

-### Web
-
-Please refer to [these instructions](../index.md#mediapipe-on-the-web).

### Coral

Please refer to
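To make the new `model_selection` option concrete, here is a short sketch contrasting the two values, using only parameters documented in this file (the input image path is a placeholder):

```python
import cv2
import mediapipe as mp

mp_face_detection = mp.solutions.face_detection

image = cv2.imread('group_photo.jpg')  # placeholder input image
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# 0 = short-range model (faces within ~2 m); 1 = full-range model (within ~5 m).
for selection in (0, 1):
    with mp_face_detection.FaceDetection(
        model_selection=selection,
        min_detection_confidence=0.5) as face_detection:
        results = face_detection.process(rgb)
        count = len(results.detections) if results.detections else 0
        print(f'model_selection={selection}: {count} face(s)')
```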
2 changes: 1 addition & 1 deletion docs/solutions/face_mesh.md
@@ -69,7 +69,7 @@ and renders using a dedicated
The
[face landmark subgraph](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_landmark/face_landmark_front_gpu.pbtxt)
internally uses a
-[face_detection_subgraph](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_front_gpu.pbtxt)
+[face_detection_subgraph](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_short_range_gpu.pbtxt)
from the
[face detection module](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection).

9 changes: 8 additions & 1 deletion docs/solutions/hair_segmentation.md
@@ -51,7 +51,14 @@ to visualize its associated subgraphs, please see

### Web

-Please refer to [these instructions](../index.md#mediapipe-on-the-web).
+Use [this link](https://viz.mediapipe.dev/demo/hair_segmentation) to load a demo
+in the MediaPipe Visualizer, then click the "Runner" icon in the top bar, as
+shown below. The demo uses your webcam video as input, which is processed
+entirely locally in real time and never leaves your device. Please see
+[MediaPipe on the Web](https://developers.googleblog.com/2020/01/mediapipe-on-web.html)
+in Google Developers Blog for details.
+
+![visualizer_runner](../images/visualizer_runner.png)

## Resources

13 changes: 13 additions & 0 deletions docs/solutions/holistic.md
@@ -176,6 +176,16 @@ A list of pose landmarks. Each landmark consists of the following:
* `visibility`: A value in `[0.0, 1.0]` indicating the likelihood of the
landmark being visible (present and not occluded) in the image.

+#### pose_world_landmarks
+
+Another list of pose landmarks in world coordinates. Each landmark consists of
+the following:
+
+* `x`, `y` and `z`: Real-world 3D coordinates in meters with the origin at the
+  center between hips.
+* `visibility`: Identical to that defined in the corresponding
+  [pose_landmarks](#pose_landmarks).

#### face_landmarks

A list of 468 face landmarks. Each landmark consists of `x`, `y` and `z`. `x`
@@ -245,6 +255,9 @@ with mp_holistic.Holistic(
mp_drawing.draw_landmarks(
annotated_image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS)
cv2.imwrite('/tmp/annotated_image' + str(idx) + '.png', annotated_image)
+    # Plot pose world landmarks.
+    mp_drawing.plot_landmarks(
+        results.pose_world_landmarks, mp_holistic.POSE_CONNECTIONS)

# For webcam input:
cap = cv2.VideoCapture(0)
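As a hedged sketch of consuming the new `pose_world_landmarks` output described above (API names follow the example in this file; the input image is a placeholder):

```python
import cv2
import mediapipe as mp

mp_holistic = mp.solutions.holistic
mp_drawing = mp.solutions.drawing_utils

with mp_holistic.Holistic(static_image_mode=True) as holistic:
    image = cv2.imread('person.jpg')  # placeholder input image
    results = holistic.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    if results.pose_world_landmarks:
        # World landmarks are in meters, origin at the center between the hips.
        nose = results.pose_world_landmarks.landmark[mp_holistic.PoseLandmark.NOSE]
        print(f'Nose at ({nose.x:.2f}, {nose.y:.2f}, {nose.z:.2f}) m')
        # 3D plot, mirroring the line added to the example above.
        mp_drawing.plot_landmarks(
            results.pose_world_landmarks, mp_holistic.POSE_CONNECTIONS)
```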
14 changes: 12 additions & 2 deletions docs/solutions/iris.md
@@ -69,7 +69,7 @@ and renders using a dedicated
The
[face landmark subgraph](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_landmark/face_landmark_front_gpu.pbtxt)
internally uses a
-[face detection subgraph](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_front_gpu.pbtxt)
+[face detection subgraph](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_short_range_gpu.pbtxt)
from the
[face detection module](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection).

@@ -193,7 +193,17 @@ on how to build MediaPipe examples.

### Web

-Please refer to [these instructions](../index.md#mediapipe-on-the-web).
+Use the following links to load a demo in the MediaPipe Visualizer, then click
+the "Runner" icon in the top bar, as shown below. The demos use your webcam
+video as input, which is processed entirely locally in real time and never
+leaves your device. Please see
+[MediaPipe on the Web](https://developers.googleblog.com/2020/01/mediapipe-on-web.html)
+in Google Developers Blog for details.
+
+![visualizer_runner](../images/visualizer_runner.png)
+
+* [MediaPipe Iris](https://viz.mediapipe.dev/demo/iris_tracking)
+* [MediaPipe Iris: Depth-from-Iris](https://viz.mediapipe.dev/demo/iris_depth)

## Resources

22 changes: 16 additions & 6 deletions docs/solutions/models.md
@@ -14,17 +14,27 @@ nav_order: 30

### [Face Detection](https://google.github.io/mediapipe/solutions/face_detection)

-* Face detection model for front-facing/selfie camera:
-[TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_front.tflite),
+* Short-range model (best for faces within 2 meters from the camera):
+[TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_short_range.tflite),
[TFLite model quantized for EdgeTPU/Coral](https://github.com/google/mediapipe/tree/master/mediapipe/examples/coral/models/face-detector-quantized_edgetpu.tflite),
[Model card](https://mediapipe.page.link/blazeface-mc)
-* Face detection model for back-facing camera:
-[TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_back.tflite),
+* Full-range model (dense, best for faces within 5 meters from the camera):
+[TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_full_range.tflite),
[Model card](https://mediapipe.page.link/blazeface-back-mc)
-* Face detection model for back-facing camera (sparse):
-[TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_back_sparse.tflite),
+* Full-range model (sparse, best for faces within 5 meters from the camera):
+[TFLite model](https://github.com/google/mediapipe/tree/master/mediapipe/modules/face_detection/face_detection_full_range_sparse.tflite),
[Model card](https://mediapipe.page.link/blazeface-back-sparse-mc)

+Full-range dense and sparse models have the same quality in terms of
+[F-score](https://en.wikipedia.org/wiki/F-score), but differ in underlying
+metrics. The dense model is slightly better in
+[Recall](https://en.wikipedia.org/wiki/Precision_and_recall), whereas the sparse
+model outperforms the dense one in
+[Precision](https://en.wikipedia.org/wiki/Precision_and_recall). Speed-wise, the
+sparse model is ~30% faster when executing on CPU via
+[XNNPACK](https://github.com/google/XNNPACK), whereas on GPU the two models
+demonstrate comparable latencies. Depending on your application, you may prefer
+one over the other.

### [Face Mesh](https://google.github.io/mediapipe/solutions/face_mesh)

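To make the dense-vs-sparse trade-off above tangible, here is a small sketch that inspects both full-range models with the stock TFLite interpreter (assumes a local checkout of google/mediapipe; exact tensor shapes are model-dependent):

```python
import tensorflow as tf

# Paths as listed above, relative to a local clone of google/mediapipe.
for name in ('face_detection_full_range.tflite',
             'face_detection_full_range_sparse.tflite'):
    interpreter = tf.lite.Interpreter(
        model_path=f'mediapipe/modules/face_detection/{name}')
    interpreter.allocate_tensors()
    shape = interpreter.get_input_details()[0]['shape']
    print(f'{name}: input shape {shape}')
```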
(Diffs for the remaining changed files are not shown.)
