google/mediapipe 8126

MediaPipe is the simplest way for researchers and developers to build world-class ML solutions and applications for mobile, edge, cloud and the web.

mgyong/awesome-mediapipe 60

A curated list of awesome MediaPipe related code examples, libraries and software

mcclanahoochie/from_google_code 2

Automatically exported from code.google.com/p/mcclanahoochie

mcclanahoochie/digpufft 0

Automatically exported from code.google.com/p/digpufft

mcclanahoochie/grafika 0

Grafika test app

mcclanahoochie/home 0

configs and stuff in my home dir

mcclanahoochie/opencv 0

Open Source Computer Vision Library

mcclanahoochie/rttmo 0

Automatically exported from code.google.com/p/rttmo

mcclanahoochie/simple-opencl 0

Automatically exported from code.google.com/p/simple-opencl

issue closed google/mediapipe

Sometimes black screen

closed time in a day

weinixuehao

issue comment google/mediapipe

Black screen on Samsung Galaxy S20+

can you provide the logs produced via adb?

PlugFox

comment created time in a day

issue comment google/mediapipe

glClear in AnnotationOverlayCalculator

Thank you for pointing this out; we will have a look.

Did you find this out by having an issue where removing glClear resolved it? If so, can you share?

wsxiaoys

comment created time in a day

issue comment google/mediapipe

face_mesh:some problem about TfLiteConverterCalculator

How/when are you measuring correctness? If I understand correctly, you are using the 'face_landmark_gpu.pbtxt' graph as-is, except replacing the model with your custom one (and updating the calculator options accordingly); is that correct? There are usually several downstream calculators that follow the inference calculator, so there are other places where things could go wrong (depending on the expected inference output). Are you comparing values directly from the output of the inference calculator, or are you just looking at the final result?

Though, if the CPU version works with your custom model, then it could mean there is something not right in the GPU path (in the conversion, inference, or even post-processing). Please update with your results using output_tensor_float_range.

Also, the comment about Android/iOS only is outdated, so we will update it (there is limited support for desktop though). Do any of the default GPU examples work correctly for you? If not, please check whether your system is supported.

chenti2x

comment created time in 2 days

issue comment google/mediapipe

GateCalculator issue in face_landmark_front_cpu.pbtxt

see https://github.com/google/mediapipe/issues/978

chenti2x

comment created time in 3 days

issue closed google/mediapipe

GateCalculator issue in face_landmark_front_cpu.pbtxt

I modified the compute shader (TfLiteConverterCalculator in face_landmark_gpu.pbtxt) as follows. I want to pass a grayscale image (0-255 pixel values) to my 160x160 model, but the inference result is always wrong. Could you help me with this problem? I am working on an Ubuntu system with a GPU. Thank you very much!

// Shader to convert GL Texture to Shader Storage Buffer Object (SSBO),
// with normalization to either: [0,1] or [-1,1].
const std::string shader_source = absl::Substitute(
    R"( #version 310 es
  layout(local_size_x = $0, local_size_y = $0) in;
  layout(binding = 0) uniform sampler2D input_texture;
  layout(std430, binding = 1) buffer Output {float elements[];} output_data;
  ivec2 width_height = ivec2($1, $2);          

  void main() {
    ivec2 gid = ivec2(gl_GlobalInvocationID.xy);
    if (gid.x >= width_height.x || gid.y >= width_height.y) return;
    vec4 pixel = texelFetch(input_texture, gid, 0);
    int linear_index = gid.y * width_height.x + gid.x;
    pixel = pixel * 255.0;
    output_data.elements[linear_index + 0] = 0.299*pixel.x + 0.587*pixel.y + 0.114*pixel.z;                    
  })",
    /*$0=*/kWorkgroupSize, /*$1=*/input.width(), /*$2=*/input.height());
RET_CHECK_CALL(GlShader::CompileShader(GL_COMPUTE_SHADER, shader_source,
                                       &gpu_data_out_->shader));
RET_CHECK_CALL(GlProgram::CreateWithShader(gpu_data_out_->shader,
                                           &gpu_data_out_->program));

// ivec4 ipixel = ivec4(pixel * 255.0);  
// output_data.elements[linear_index + 0] = float((19595*ipixel.x + 38469*ipixel.y + 7472*ipixel.z)>> 16);                          
return ::mediapipe::OkStatus();

closed time in 3 days

chenti2x

issue comment google/mediapipe

face_mesh:some problem about TfLiteConverterCalculator

Hi, a few questions:

  • have you also set max_num_channels: 1 in the graph options?
  • can you try using the TensorFloatRange graph option (using the original shader)?
    • set output_tensor_float_range: { min:0 max:255 }
    • (mediapipe v 0.7.7+ required)
  • there is also use_custom_normalization with custom div and sub (see the tflite_converter_calculator.proto) but the TensorFloatRange is probably easier to use.
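
For reference, a converter node with those options might look roughly like this (a sketch based on the bullets above; please verify the exact field names against tflite_converter_calculator.proto for your MediaPipe version):

node {
  calculator: "TfLiteConverterCalculator"
  input_stream: "IMAGE_GPU:transformed_input_video"
  output_stream: "TENSORS_GPU:image_tensor"
  options: {
    [drishti.TfLiteConverterCalculatorOptions.ext] {
      max_num_channels: 1
      output_tensor_float_range: { min: 0 max: 255 }
    }
  }
}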

I don't think there is anything internally preventing the use of non-normalized values. The defaults do expect values in [0, 1] or [-1, 1], which covers all the models MP provides, but there are the override options I mentioned above.

Can you also verify that the CPU version works? Knowing this could also help narrow down potential issues.

chenti2x

comment created time in 3 days

issue comment google/mediapipe

Converting tflite tensor to imageframe

I don't think there is a calculator already made for this, but there are some existing utilities (here and here) to help you design your own calculator.

Your Process() call would look something like this (untested pseudo code):

  const auto& input_tensors =
      cc->Inputs().Tag("TENSOR").Get<std::vector<TfLiteTensor>>();
  cv::Mat tensor_mat = ConvertTfliteTensorToCvMat(input_tensors[0]);

  auto output_img = absl::make_unique<ImageFrame>(
      FORMAT, tensor_mat.cols, tensor_mat.rows);
  cv::Mat output_mat = mediapipe::formats::MatView(output_img.get());
  tensor_mat.copyTo(output_mat);

  cc->Outputs()
      .Tag("IMAGE")
      .Add(output_img.release(), cc->InputTimestamp());

You would have to determine what FORMAT should be based on your incoming tensor, and since you are converting to an imageframe, it can only be one of these formats.
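
As a rough illustration of what such a conversion helper could look like (this is not an existing MediaPipe utility; it assumes a float32 tensor with NHWC dims and wraps the buffer without copying, so the tensor must outlive the Mat):

cv::Mat ConvertTfliteTensorToCvMat(const TfLiteTensor& tensor) {
  // Assumes dims are [1, height, width, channels] and float32 data.
  const int height = tensor.dims->data[1];
  const int width = tensor.dims->data[2];
  const int channels = tensor.dims->data[3];
  // Wraps the tensor's buffer; no deep copy is made.
  return cv::Mat(height, width, CV_32FC(channels), tensor.data.f);
}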

Hope that points you in the right direction.

maylad31

comment created time in 5 days

issue comment google/mediapipe

Mediapipe High memory usage & How to remove openCV from Mediapipe.

update on memory usage in https://github.com/google/mediapipe/issues/969#issuecomment-667294219

usernexme

comment created time in 8 days

issue comment google/mediapipe

High memory and 0.77 version build error.

Hi, sorry for the delay.

Here is a temporary, manual workaround:

Please replace the ImageCroppingCalculator::RenderGpu function with what is below...

::mediapipe::Status ImageCroppingCalculator::RenderGpu(CalculatorContext* cc) {
  if (cc->Inputs().Tag(kImageGpuTag).IsEmpty()) {
    return ::mediapipe::OkStatus();
  }
#if !defined(MEDIAPIPE_DISABLE_GPU)
  const Packet& input_packet = cc->Inputs().Tag(kImageGpuTag).Value();
  const auto& input_buffer = input_packet.Get<mediapipe::GpuBuffer>();
  auto src_tex = gpu_helper_.CreateSourceTexture(input_buffer);

  int out_width, out_height;
  GetOutputDimensions(cc, src_tex.width(), src_tex.height(), &out_width,
                      &out_height);

  // auto dst_tex =
  //   gpu_helper_.CreateDestinationTexture(out_width, out_height); // don't use the memory pool

  mediapipe::GpuBuffer dst_buf =
      mediapipe::GpuBuffer(mediapipe::GlTextureBuffer::Create(
          out_width, out_height, input_buffer.format()));
  auto ptr = dst_buf.GetGlTextureBufferSharedPtr();
  ptr->WaitOnGpu();
  glBindTexture(ptr->target(), ptr->name());
  glBindTexture(ptr->target(), 0);
#ifdef __ANDROID__
  glBindFramebuffer(GL_FRAMEBUFFER, 0);
#endif
  GLuint fb = 0;
  glDisable(GL_DEPTH_TEST);
  glGenFramebuffers(1, &fb);
  glBindFramebuffer(GL_FRAMEBUFFER, fb);
  glViewport(0, 0, ptr->width(), ptr->height());
  glActiveTexture(GL_TEXTURE0);
  glBindTexture(ptr->target(), ptr->name());
  glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, ptr->target(),
                         ptr->name(), 0);

  // Run cropping shader on GPU.
  {
    // gpu_helper_.BindFramebuffer(dst_tex);  // GL_TEXTURE0

    glActiveTexture(GL_TEXTURE1);
    glBindTexture(src_tex.target(), src_tex.name());

    GlRender();

    glActiveTexture(GL_TEXTURE2);
    glBindTexture(GL_TEXTURE_2D, 0);
    glFlush();
  }

  // Send result image in GPU packet.
  // auto output = dst_tex.GetFrame<mediapipe::GpuBuffer>();

  auto output = absl::make_unique<mediapipe::GpuBuffer>(dst_buf);
  output->GetGlTextureBufferSharedPtr()->Updated(
      gpu_helper_.GetGlContext().CreateSyncToken());
  glBindFramebuffer(GL_FRAMEBUFFER, 0);
  glDeleteFramebuffers(1, &fb);

  cc->Outputs().Tag(kImageGpuTag).Add(output.release(), cc->InputTimestamp());

  // Cleanup
  src_tex.Release();
  // dst_tex.Release();
#endif  //  !MEDIAPIPE_DISABLE_GPU

  return ::mediapipe::OkStatus();
}

A more elegant solution is in review, and will come eventually.

The underlying issue is that there is an internal GPU memory pool/cache that is indexed by image size. In the hand tracking graph, the cropping calculator can (and usually does) produce a crop with a new image size each frame, which fills the pool with images that are unlikely to be reused. The workaround here is to skip the memory cache for the cropping operation.

istoneyou

comment created time in 8 days

issue comment google/mediapipe

Issue in running the face-mesh model

can you provide the log messages?

does running a video file work? (using --input_video_path=/path/to/video.mp4 )

does the face mesh cpu version work?

aashishwaikar

comment created time in 10 days

issue comment google/mediapipe

Changing font

Hi,

The text rendering is done via OpenCV. OpenCV 3+ introduced custom fonts, so something like loadFontData can be added to the annotation renderer. But first you need to make sure you have the freetype module installed (either via your package manager or by recompiling OpenCV).

So, according to that link, you could replace the current putText() call with something like:

cv::Ptr<cv::freetype::FreeType2> ft2 = cv::freetype::createFreeType2();  // FreeType2 is created via this factory
ft2->loadFontData("your-font.ttf", 0);
ft2->putText(src, ...);

(Though I think you would want to cache the font object and not re-load the file each frame.)

Note, I have not tested this, but it should be enough to get you going!

wonjongbot

comment created time in 11 days

issue comment google/mediapipe

Unable to cross-compile for aarch64

Glad you got things running.

I don't think the hand_tracking model is fully supported by Coral. Also, the Coral TPU requires the model to be converted to 8-bit and then recompiled into a new edge TPU file; I assume you have done that?

Double-check the logs, because I doubt the USB stick is being used, and if it is, only a few ops are actually running on the accelerator. Unless there has been an update to the Coral Edge TPU compiler (which is possible), some ops used in the hand tracking/detection models are not supported by the Edge TPU, which forces those ops (and all following ops) to be delegated to the CPU.

So, my suspicion is that the model is just running 100% on the CPU if you haven't recompiled for the Edge TPU, and if you have, then there are probably unsupported ops preventing it from running fully on the TPU.

garyscetbon

comment created time in 16 days

issue comment google/mediapipe

Desktop GPU run error in Docker container - An unacceptable value is specified

Based on the error, I would guess the GPU is not set up correctly, or, if it is, that it doesn't support OpenGL ES 3.1+ inside the container.

The Docker container MediaPipe provides doesn't do anything special to enable the GPU, so you may need to search for how to make sure the GPU is supported in the container you create.

Related:

  • https://github.com/thewtex/docker-opengl
  • https://hub.docker.com/r/nvidia/opengl
  • https://github.com/google/mediapipe/issues/150
  • https://github.com/google/mediapipe/issues/840

...though it seems like you have already searched the GitHub issues.

AlexYiningLiu

comment created time in 17 days

issue comment google/mediapipe

Unable to cross-compile for aarch64

Have you taken a look at the coral build instructions (seems likely)? Most of that process would apply here.

Do you have the exact same opencv version installed in both places?

Does compiling just the hello_world work?

It seems to be a linker issue, so please double check that the libs are copied correctly (sometimes symlinks cause issues).

The Raspberry Pi 4 should be powerful enough (i.e., have enough RAM) to compile directly on the device, if that is an option for you (the main issue is that Bazel needs lots of RAM; it has been done on 1GB Coral devices too, by allocating extra HD space as swap).

garyscetbon

comment created time in 17 days

issue comment google/mediapipe

Running Mediapipe on GPU with OpenGL ES < 3.0

The requirement of GLES 3.1+ mainly comes from TFLite and the TFLite GPU delegate used for inference. That would mainly be a question for the TensorFlow Lite team; I am also not aware of any plans to do so.

There is some work being done on improving ML inference graphs/examples on machines with only OpenGL ES 2.0 or 3.0, by supporting a mix of CPU/GPU calculators: GPU preprocessing > CPU inference > CPU tensor post-processing > GPU render. This would be a hybrid of the current "all CPU" or "all GPU" path calculators.

This can be an option for you right now (GPU input/output + CPU inference), but you would have to design the graph yourself for now, while a more elegant version is being developed by the MP team. Just be sure to compile with this flag.
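
As a rough sketch of that hybrid layout (untested; the stream names are made up, but all of the calculators shown exist in MediaPipe):

# GPU image -> CPU image for inference
node {
  calculator: "GpuBufferToImageFrameCalculator"
  input_stream: "preprocessed_video_gpu"
  output_stream: "preprocessed_video_cpu"
}

# CPU conversion + inference
node {
  calculator: "TfLiteConverterCalculator"
  input_stream: "IMAGE:preprocessed_video_cpu"
  output_stream: "TENSORS:image_tensor"
}
node {
  calculator: "TfLiteInferenceCalculator"
  input_stream: "TENSORS:image_tensor"
  output_stream: "TENSORS:output_tensors"
  # (model options omitted)
}

# ... CPU tensor post-processing here, then back to GPU for rendering ...
node {
  calculator: "ImageFrameToGpuBufferCalculator"
  input_stream: "annotated_output_cpu"
  output_stream: "annotated_output_gpu"
}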

QBoulanger

comment created time in 17 days

issue comment google/mediapipe

Could not find type "type.googleapis.com/mediapipe.OpenCvVideoEncoderCalculatorOptions" stored in google.protobuf.Any.

Yes, the multi_hand_tracking_gpu binary and the multi_hand_tracking_mobile.pbtxt graph are the correct pairing. The same goes for most of the *_gpu desktop demos and the *_mobile.pbtxt graphs.

The naming isn't great, but GPU support for desktop is rather experimental and came well after mobile GPU support. This is something we are working to unify in the future.

laoreja

comment created time in 17 days

issue comment google/mediapipe

Can't feed MacOS host webcam stream to OpenCvVideoDecoderCalculator on VM's Ubuntu via RTP

Can the VM read the webcam directly? If so, the video decoder calculator is not needed, and you can use this example to run from the camera directly.

If the webcam doesn't work in the VM (I assume that is why you are using the video+ffmpeg hack), then you may need to add a special case for the "infinite" frame count check here and here, to allow packets to flow.

Another thing to try: the webcam demo mentioned above also supports loading from a file and doesn't do any header checks, so you could try that too (they both use OpenCV under the hood).

garyscetbon

comment created time in 17 days

issue comment google/mediapipe

Prevent face detection bounding box from getting drawn

Do you want to still draw the face keypoints? Or do you want to get rid of all drawing?

Unfortunately there isn't currently an easy way to toggle just the face bounding box.

I think you could comment out both calls to SetRectCoordinate to get rid of the rectangle.

Also (untested), you could try commenting out the line input_stream: "render_data" for the AnnotationOverlayCalculator node (this would remove all drawings).

The main point is that the RENDER_DATA contains all the drawing info, so you need to find where it is set and remove it (or skip where it is drawn in the renderer).

BTW, the mutable_color()->set_r/thickness code you mention is already too late (the lines have already been added; that code only sets their properties), so if you trace backwards from there, you should find the SetRectCoordinate calls mentioned above to modify.

justaguyin2k20

comment created time in a month

issue comment google/mediapipe

Hair Segmentation iOS Example

Hi, Thanks for the question. Unfortunately, there are no immediate plans for this, but it should happen eventually. Meanwhile, you can use a slightly modified graph on iOS:

# MediaPipe graph that performs hair segmentation with TensorFlow Lite on GPU.
# Used in the example in
# mediapipe/examples/ios/hairsegmentationgpu.

# Images on GPU coming into and out of the graph.
input_stream: "input_video"
output_stream: "output_video"

# Throttles the images flowing downstream for flow control. It passes through
# the very first incoming image unaltered, and waits for
# TfLiteTensorsToSegmentationCalculator downstream in the graph to finish
# generating the corresponding hair mask before it passes through another
# image. All images that come in while waiting are dropped, limiting the number
# of in-flight images between this calculator and
# TfLiteTensorsToSegmentationCalculator to 1. This prevents the nodes in between
# from queuing up incoming images and data excessively, which leads to increased
# latency and memory usage, unwanted in real-time mobile applications. It also
# eliminates unnecessary computation, e.g., a transformed image produced by
# ImageTransformationCalculator may get dropped downstream if the subsequent
# TfLiteConverterCalculator or TfLiteInferenceCalculator is still busy
# processing previous inputs.
node {
  calculator: "FlowLimiterCalculator"
  input_stream: "input_video"
  input_stream: "FINISHED:hair_mask"
  input_stream_info: {
    tag_index: "FINISHED"
    back_edge: true
  }
  output_stream: "throttled_input_video"
}

# Transforms the input image on GPU to a 512x512 image. To scale the image, by
# default it uses the STRETCH scale mode that maps the entire input image to the
# entire transformed image. As a result, image aspect ratio may be changed and
# objects in the image may be deformed (stretched or squeezed), but the hair
# segmentation model used in this graph is agnostic to that deformation.
node: {
  calculator: "ImageTransformationCalculator"
  input_stream: "IMAGE_GPU:throttled_input_video"
  output_stream: "IMAGE_GPU:transformed_input_video"
  options: {
    [drishti.ImageTransformationCalculatorOptions.ext] {
      output_width: 512
      output_height: 512
    }
  }
}

# Caches a mask fed back from the previous round of hair segmentation, and upon
# the arrival of the next input image sends out the cached mask with the
# timestamp replaced by that of the input image, essentially generating a packet
# that carries the previous mask. Note that upon the arrival of the very first
# input image, an empty packet is sent out to jump start the feedback loop.
node {
  calculator: "PreviousLoopbackCalculator"
  input_stream: "MAIN:throttled_input_video"
  input_stream: "LOOP:hair_mask"
  input_stream_info: {
    tag_index: "LOOP"
    back_edge: true
  }
  output_stream: "PREV_LOOP:previous_hair_mask"
}

# Embeds the hair mask generated from the previous round of hair segmentation
# as the alpha channel of the current input image.
node {
  calculator: "SetAlphaCalculator"
  input_stream: "IMAGE_GPU:transformed_input_video"
  input_stream: "ALPHA_GPU:previous_hair_mask"
  output_stream: "IMAGE_GPU:mask_embedded_input_video"
}

# Converts the transformed input image on GPU into an image tensor stored in
# tflite::gpu::GlBuffer. The zero_center option is set to false to normalize the
# pixel values to [0.f, 1.f] as opposed to [-1.f, 1.f].
# With the max_num_channels option set to 4, all 4 RGBA channels are contained
# in the image tensor.
node {
  calculator: "TfLiteConverterCalculator"
  input_stream: "IMAGE_GPU:mask_embedded_input_video"
  output_stream: "TENSORS_GPU:image_tensor"
  options: {
    [drishti.TfLiteConverterCalculatorOptions.ext] {
      zero_center: false
      max_num_channels: 4
    }
  }
}

# Generates a single side packet containing a TensorFlow Lite op resolver that
# supports custom ops needed by the model used in this graph.
node {
  calculator: "TfLiteCustomOpResolverCalculator"
  output_side_packet: "op_resolver"
  options: {
    [drishti.TfLiteCustomOpResolverCalculatorOptions.ext] {
      use_gpu: true
    }
  }
}

# Runs a TensorFlow Lite model on GPU that takes an image tensor and outputs a
# tensor representing the hair segmentation, which has the same width and height
# as the input image tensor.
node {
  calculator: "TfLiteInferenceCalculator"
  input_stream: "TENSORS_GPU:image_tensor"
  output_stream: "TENSORS:segmentation_tensor"
  input_side_packet: "CUSTOM_OP_RESOLVER:op_resolver"
  options: {
    [drishti.TfLiteInferenceCalculatorOptions.ext] {
      model_path: "third_party/mediapipe/models/hair_segmentation.tflite"
      use_gpu: true
    }
  }
}

# The next step (tensors to segmentation) is not yet supported on iOS GPU.
# Convert the previous segmentation mask to CPU for processing.
node: {
  calculator: "GpuBufferToImageFrameCalculator"
  input_stream: "previous_hair_mask"
  output_stream: "previous_hair_mask_cpu"
}

# Decodes the segmentation tensor generated by the TensorFlow Lite model into a
# mask of values in [0.f, 1.f], stored in the R channel of a CPU buffer. It also
# takes the mask generated previously as another input to improve the temporal
# consistency.
node {
  calculator: "TfLiteTensorsToSegmentationCalculator"
  input_stream: "TENSORS:segmentation_tensor"
  input_stream: "PREV_MASK:previous_hair_mask_cpu"
  output_stream: "MASK:hair_mask_cpu"
  options: {
    [drishti.TfLiteTensorsToSegmentationCalculatorOptions.ext] {
      tensor_width: 512
      tensor_height: 512
      tensor_channels: 2
      combine_with_previous_ratio: 0.9
      output_layer_index: 1
    }
  }
}

# Send the current segmentation mask to GPU for the last step, blending.
node: {
  calculator: "ImageFrameToGpuBufferCalculator"
  input_stream: "hair_mask_cpu"
  output_stream: "hair_mask"
}

# Colors the hair segmentation with the color specified in the option.
node {
  calculator: "RecolorCalculator"
  input_stream: "IMAGE_GPU:throttled_input_video"
  input_stream: "MASK_GPU:hair_mask"
  output_stream: "IMAGE_GPU:output_video"
  options: {
    [drishti.RecolorCalculatorOptions.ext] {
      color { r: 0 g: 0 b: 255 }
      mask_channel: RED
    }
  }
}

While this is not ideal (it introduces two CPU/GPU transfers), it should still be faster than a CPU-only pipeline (which is also an option).

purpleblues

comment created time in a month

issue comment google/mediapipe

face_mesh_gpu fails to run the graph on Ubuntu via ssh

Hi, you will not be able to run the example on GPU if it is compiled with DISABLE_GL_COMPUTE (GL compute is required for inference); you need to use either an all-CPU or an all-GPU graph (so compile with --define MEDIAPIPE_DISABLE_GPU=1 for all CPU, or use the command in your "2nd try" for all GPU).

If what you have listed above is all that glxinfo produces, then GPU support is not available on your system.

Can you verify that you have a supported GPU and drivers installed?

lackhole

comment created time in a month

PR closed google/mediapipe

In normal case channel order is NHWC

The original setting will result in an error when the input height != width

+2 -2

2 comments

1 changed file

brucechou1983

pr closed time in a month

pull request comment google/mediapipe

In normal case channel order is NHWC

Hi, thanks for your contribution. This change is not strictly necessary: the ordering of options defined in .proto files (and their field ID numbers) doesn't affect how they are used or specified in a MediaPipe graph (.pbtxt). Re-assigning the field ID numbers here will not change the ordering, and it may actually break existing graphs. Please see here for more info about how .proto files are defined (and the reason for unique ID numbers).

If you find there is something not working with your segmentation model (when width != height), and swapping these values in your graph works, then there may be an issue elsewhere in the tensors-to-segmentation calculator, but it would not be fixed here. Thanks.
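
To illustrate the point about field IDs with a generic example (not the actual MediaPipe proto): in a message like the one below, the numbers are wire-format identifiers rather than an ordering, so re-assigning which name owns which number changes how previously written options are decoded, not how the fields are ordered or used in a .pbtxt graph.

message ExampleCalculatorOptions {
  optional int32 output_width = 1;   // "1" identifies this field on the wire
  optional int32 output_height = 2;  // renumbering breaks already-serialized data
}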

brucechou1983

comment created time in a month

issue comment google/mediapipe

Failed to run the graph

The error message says the demo cannot initialize the camera or load the video file.

If you have a webcam, please make sure it is connected and works via other apps first.

If you are loading a video, please make sure the full path is used when running with this flag --input_video_path=/full/path/to/video.mp4

Chandler-Shuyu-Xie

comment created time in a month

issue comment google/mediapipe

Errors when run hand_detection demo on snapdragon845, modified from desktop demo

Hi, the build flag you are using compiles for an ARM CPU, but also for the Android OS. To compile only for ARM (with a Linux OS), try replacing the --config=android_arm64 flag with --cpu=aarch64.

If your platform is indeed Android, then you will need extra code inside demo_run_graph_main_gpu.cc to register the Android asset manager with the main activity's context.

zhenxingNreal

comment created time in a month

issue comment google/mediapipe

crash while running desktop/example/hand_tracking

Hi, unfortunately this looks like an issue with how the virtual machine handles the GPU. I would search around for VMware GPU support, or see if you can dual boot into linux.

VeitchKyrie

comment created time in a month

issue comment google/mediapipe

crash while running desktop/example/hand_tracking

Hi, based on your logs:

I20200702 10:19:40.635076 22080 gl_context.cc:324] GL version: 3.0 (OpenGL ES 3.0 Mesa 18.0.5)
I20200702 10:19:40.635179 22054 demo_run_graph_main_gpu.cc:67] Initialize the camera or load the video.
[ WARN:0] VIDEOIO ERROR: V4L: can't open camera by index 0

I see two issues

  1. your GPU is not supported, because mediapipe requires OpenGL ES 3.1+
  2. the code is attempting to open a webcam and can't

While #2 can be fixed by supplying a path to a video file, the #1 issue is a deal breaker.

You can try using the CPU version of the hand demo, and supply a video path (instead of defaulting to webcam) via --input_video_path=/path/to/file.mp4

VeitchKyrie

comment created time in a month

issue comment google/mediapipe

Cannot build hand_tracking_gpu model with mediapipe docker : fatal error: EGL/egl.h: No such file or directory

To be honest, I've never personally tried the GPU examples in Docker, so I can't speak from experience, but other users seem to have had success. With that being said:

The 'unable to open display' error can come from a few different places...

Try running $ cd /tmp/.X11-unix && for x in X*; do echo ":${x#X}"; done ; cd - to see which DISPLAY variables are available; it should print something like

:0
:20

Then run the container with DISPLAY=:0 or DISPLAY=:20 (or whatever number you have).

see https://github.com/google/mediapipe/issues/317 (also linked from issue #150 above)

Also, see if it's possible to start an X11 server inside the Docker container, via startx -- :1 & (start X on display :1), then run your MediaPipe command with export DISPLAY=:1 && <cmd>.

romainvo

comment created time in a month

issue comment google/mediapipe

Running an example in docker image fails with EGL/egl.h not found.

https://github.com/google/mediapipe/issues/846

lackhole

comment created time in a month

issue comment google/mediapipe

Multihand tracking using GPU in Ubuntu terminates abruptly with signal SIGSEGV (Address boundary error)

Which GPU do you have? How much RAM does your GPU have? The hand tracking examples use quite a bit of GPU memory (something that is being worked on).

Does the unmodified example show this issue?

Do the other GPU examples (face detection, single hand tracking, face mesh) show the same issue?

"increases with each additional increase in code complexity" — where is this new code? In the demo runner, or are you adding/modifying calculators? How is the new CPU/GPU memory being managed?

prantoran

comment created time in a month

issue comment google/mediapipe

Cannot build hand_tracking_gpu model with mediapipe docker : fatal error: EGL/egl.h: No such file or directory

Can you confirm you have followed these instructions to check for GPU support? https://google.github.io/mediapipe/getting_started/gpu_support.html#opengl-es-setup-on-linux-desktop

Also see https://github.com/google/mediapipe/issues/148

romainvo

comment created time in a month

issue comment google/mediapipe

How to efficiently convert tensor to cv::Mat

Looks like your options are:

  • write a compute shader to process the GPU tensor (similar to the code in TensorsToDetectionsCalculator), or
  • have the inference calculator output directly to a CPU tensor (potentially more efficient than tensor.Read()).

For the former, you will need to research elsewhere how to write OpenGL shaders for your task.

To do the latter, simply change output_stream: "TENSORS_GPU:tensors" to output_stream: "TENSORS:tensors" for the inference calculator. You will then have a TfLiteTensor result, which you can wrap as a cv::Mat with a similar approach:

cv::Mat hed_tensor_mat(1, hed_buf.size(), CV_32FC1, input_tensors[0]->data.f);

NOTE: make sure you keep the input_stream as GPU.

weinixuehao

comment created time in 2 months

issue comment google/mediapipe

How to efficiently convert tensor to cv::Mat

There is no direct conversion between GpuTensor and cv::Mat. You will need to download the data from GPU to CPU (via tensor.Read()). You can then create a cv::Mat from a vector without a deep copy:

cv::Mat hed_tensor_mat(1, hed_buf.size(), CV_32FC1, hed_buf.data());

This simply wraps the Mat around the std::vector data (the vector still owns the data). You may need to call reshape() for your tensor dimensions:

hed_tensor_mat = hed_tensor_mat.reshape(channels, rows);

weinixuehao

comment created time in 2 months

issue closed google/mediapipe

image preprocess

Hello, I would like to know what preprocessing you do before network inference. Is it img/255 or img/255 - 1? Or is there some other preprocessing? Thanks.

closed time in 2 months

JR-Wang

issue comment google/mediapipe

image preprocess

Closing due to lack of activity

JR-Wang

comment created time in 2 months

issue comment google/mediapipe

Sometimes black screen

I had a quick look:

One issue I see is that your input texture uniform is never set correctly. Your code glUniform1i(glGetUniformLocation(program_, "input_frame"), 1); and your shader declaration uniform sampler2D previewtexture; do not match; they should both use the same name (both "input_frame" or both "previewtexture").
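
A minimal matched pair would look like this (keeping the "input_frame" name; the texture name shown is a placeholder for whatever texture you bind to unit 1):

// In the fragment shader:
//   uniform sampler2D input_frame;
// In the host code, after glUseProgram(program_):
glUniform1i(glGetUniformLocation(program_, "input_frame"), 1);  // sampler reads from texture unit 1
glActiveTexture(GL_TEXTURE1);
glBindTexture(GL_TEXTURE_2D, your_texture_name);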

also your "preview_tex" texture doesn't seem to be used, and probably isn't necessary, because you are creating a destination texture and binding it via the gpu_helper.

also please utilize the visualizer and make sure your graph connections look correct.

weinixuehao

comment created time in 2 months

issue comment google/mediapipe

Sometimes black screen

Can you provide a full LogCat trace? Do any other examples do this? What is the exact phone model and Android version?

weinixuehao

comment created time in 2 months

issue comment google/mediapipe

How to convert tensors to gpubuffer for render on android ?

Generally the flow of MP examples is: input > convert to tensor > tensor inference > tensors-to-X conversion > X-to-render-data conversion > AnnotationOverlayCalculator (which accepts render data). Please see the selection of 'tflite_tensors_to_x' calculators here and the 'x to render data' calculators here, and see if one of those fits your needs. If not, you will need to create a calculator to convert your tensor data into render data for the AnnotationOverlayCalculator.

If your output tensor is literally an RGB image, you will need to write a calculator to convert it to a mediapipe::ImageFrame; then you can use ImageFrameToGpuBufferCalculator.

weinixuehao

comment created time in 2 months

issue comment google/mediapipe

How to change hair color in real time when running hairsegmentationgpu in Android?

Please see https://github.com/google/mediapipe/issues/420 and https://github.com/google/mediapipe/issues/18

mmm2016

comment created time in 2 months

issue comment google/mediapipe

Read docs for mediapipe is not working

Hi, please use https://google.github.io/mediapipe/getting_started/building_examples.html

It seems some links were broken during the docs transition to the new system at https://google.github.io/mediapipe/

Somyarani1113

comment created time in 2 months

issue comment google/mediapipe

build error using Coral

Can you please make sure you have the latest release (or master) of MediaPipe? Please double-check that you have followed the Coral readme closely, and that the Docker is configured with the python, python-pip, python3-pip, and python-numpy packages installed (this should have been done automatically). Does building the hello_world example work (as per the Coral readme)? Also try running bazel clean --expunge before building again.

craston

comment created time in 2 months

issue comment google/mediapipe

Handtracking with real 3D coordinate based on camera coordinate system?

There is a related discussion here https://github.com/google/mediapipe/issues/742#issuecomment-639104199

SuperSaiyan30

comment created time in 2 months

issue comment google/mediapipe

Hand tracking landmarks - Z value range

The hand model uses "scaled orthographic projection" (or weak perspective), with some fixed average depth (Z_avg).

Weak-perspective projection is an orthographic projection plus a scaling, which approximates perspective projection by assuming that all points on a 3D object are at roughly the same distance from the camera.

The justification for using weak perspective is that in many cases it approximates perspective closely, in particular when the variation of the object's depth along the line of sight (delta Z) is small compared to the fixed average depth (Z_avg). This also means objects at a distance do not distort due to perspective, but only uniformly scale up/down.

The z predicted by the model is relative depth, based on the Z_avg of a "typical hand depth" (e.g., holding the phone with one hand while the other is tracked, or being close to the phone and showing both hands). Also, the range of z is unconstrained, but it is scaled proportionally along with x and y (via weak projection) and expressed in the same units as x and y.
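
Roughly, in equations (symbols here are illustrative, not taken from the model code):

  full perspective:  x_img = f * X / Z,   y_img = f * Y / Z
  weak perspective:  x_img ≈ s * X,       y_img ≈ s * Y,   with s = f / Z_avg

The approximation is good when |Z - Z_avg| is small relative to Z_avg, which is why a hand farther from the camera scales down uniformly instead of distorting.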

There is a root landmark point (wrist) that all the other landmark depths are relative to (again normalized via weak projection w.r.t. x & y).

Tectu

comment created time in 2 months

issue comment google/mediapipe

Using other models

Almost all MediaPipe examples follow a simple general template/flow:

  1. Input image
  2. Preprocess input image (image->tensor)
  3. ML model inference (tensor->tensor)
  4. Post process tensors (tensor->metadata)
  5. Render (metadata->image)

The closest example to YOLO and SSD that we have (and probably the simplest ML example) is object detection. Have a look at the post-processing calculators in particular, like TfLiteTensorsToDetectionsCalculator; that is where the object detection model output is processed (for that example).

Zumbalamambo

comment created time in 2 months

issue comment google/mediapipe

How to obtain unnormalized landmark coordinates?

The beauty of normalized coordinates is that they [generally] do not care what output size is used. Using getX()*width and getY()*height should be plenty accurate.

Let's say you have a hand coordinate, in pixel space, of (123, 123) in your 1080 x 1440 image, and it moves 1 pixel up to (123, 124). In normalized coordinates this would be

  123 / 1440.0 = 0.08541666666666667
  124 / 1440.0 = 0.08611111111111111

meaning there is a difference of abs(0.08541666666666667 - 0.08611111111111111) = 0.000694444444444442, which floating point can easily handle. This is even more accurate than integer coordinates, allowing for sub-pixel accuracy.

AlexYiningLiu

comment created time in 2 months

issue comment google/mediapipe

Recolour the mask in hair segmentation model using LUTs

@Nerdyvedi I guess you can call it a 3D LUT (I was mainly referring to using a 2D texture as the LUT instead of a simpler 1D one). Regardless, you can pick one color channel and use its value to index into the big grid (choosing one of the 8x8 squares), then use the other two channels to index (x, y) within that sub-square.

@jdmakwana1999 That is a comment (taken from the GPU code) that describes what the CPU code is doing (the GPU code was written first).

Nerdyvedi

comment created time in 2 months

issue comment google/mediapipe

Build error when Cross compiling for Coral, xnnpack_delegate.h: No such file or directory

Hi, can you confirm you have the most recent version of MediaPipe? This should have been addressed in 0.7.5 (and HEAD). If you are not up to date and don't want to update, then any xnn references can be commented out (they are irrelevant on Coral anyway).

gustavryd

comment created time in 2 months

issue comment google/mediapipe

Build error when cross compiling for Coral

Interesting, I thought this was fixed in the newest release of MediaPipe (more specifically, by the newer GLog). You can safely replace return (void*)context->PC_FROM_UCONTEXT; with return NULL;

gustavryd

comment created time in 2 months

issue comment google/mediapipe

image preprocess

Hi, it depends on what the model expects. Some inference models expect data normalized to [0, 1], some use [-1, 1], and fixed-point models use their own quantization. Have a look at the TfLiteConverterCalculator and its uses in graphs, as this is where any normalization happens. Also, before that calculator is run, there is usually an ImageTransformationCalculator for resizing the image to what the model expects (e.g. 256x256).
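
For reference, those two steps typically appear in a graph like this (a sketch mirroring the hair segmentation graph earlier in this feed; the size and the zero_center setting depend on the model):

node {
  calculator: "ImageTransformationCalculator"
  input_stream: "IMAGE:input_video"
  output_stream: "IMAGE:transformed_input_video"
  options: {
    [drishti.ImageTransformationCalculatorOptions.ext] {
      output_width: 256
      output_height: 256
    }
  }
}
node {
  calculator: "TfLiteConverterCalculator"
  input_stream: "IMAGE:transformed_input_video"
  output_stream: "TENSORS:image_tensor"
  options: {
    [drishti.TfLiteConverterCalculatorOptions.ext] {
      zero_center: false  # false -> [0, 1], true -> [-1, 1]
    }
  }
}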

JR-Wang

comment created time in 3 months

issue comment google/mediapipe

Ios - multihand tracker crashes after dealloc

I believe this issue may be resolved in the next release. It has to do with needing to add a private namespace around the declaration of GPUData in both tflite_converter_calculator.cc and tflite_inference_calculator.cc. You can try adding that, or wait until the next push, likely early next week.

SvyatHoly

comment created time in 3 months

issue comment google/mediapipe

Recolour the mask in hair segmentation model using LUTs

"then uses the luminance (grayscale) to blend between the 2 LUT results"

As mentioned above, it is linear interpolation.

Nerdyvedi

comment created time in 3 months

issue comment google/mediapipe

coral face model with usb accelerator

Great! I'm not too familiar with the Python API, but I believe it expects the model to have a certain final op that does some filtering. The face model MP provides doesn't have this final TF op, and instead relies on a calculator to do the filtering [see here]. Right below that section in the calculator, you can see the 4-output version, which I believe the Python code expects. You would need to either choose a different model, or adapt the Python code to handle the 2-tensor version (as seen in the calculator).

natxopedreira

comment created time in 3 months

issue comment google/mediapipe

Why the outputs of the resize function using ImageTransformationCalculator different for IMAGE (CPU) and IMAGE_GPU (GPU)?

What are the input dimensions of the image?

I think one problem could be how you are indexing the Mat. Instead of using input[(i*input_mat.rows + j) * 4 + 0], try using input.at<uchar>(i, j). This is for two reasons:

  • there may be padding on the image (the actual row width, i.e. the stride, may be larger than input_mat.cols), and
  • you are using y*height + x when it should be y*width + x (or x*height + y).
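
A small illustration of the difference (assuming a 4-channel, 8-bit image, as the "* 4" stride in your snippet suggests; the variable names are illustrative):

// at<>() accounts for the row stride (input_mat.step), including any padding.
uchar r = input_mat.at<cv::Vec4b>(y, x)[0];

// Equivalent raw-pointer access, with the stride handled explicitly:
const uchar* row = input_mat.ptr<uchar>(y);  // == input_mat.data + y * input_mat.step
uchar r2 = row[x * 4 + 0];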

yxchng

comment created time in 3 months

issue closed google/mediapipe

In android,save bitmap,not color

I am using MediaPipe on Android and added this to my MainActivity:

processor.addPacketCallback(
    "input_video_cpu",
    (packet) -> {
      Bitmap bitmap = AndroidPacketGetter.getBitmapFromRgb(packet);
    });

The saved bitmap has no color, but it has a fence-like (striped) pattern, and its size is width * height * 3. I want to know what is going on and how to get the colors back; should it not be saved as RGBA? Looking forward to your reply. Thank you.

closed time in 3 months

teleger

issue comment google/mediapipe

coral face model with usb accelerator

Do you have the latest edge tpu runtime and tflite library?

natxopedreira

comment created time in 3 months

issue comment google/mediapipe

Recolour the mask in hair segmentation model using LUTs

Hi, the current recolor calculator is a simplified version of what is in the paper. The algorithm in the paper computes the luminance (grayscale) value at each hair pixel, uses that value (0-255) as an index into a LUT, and overlays the color from the LUT onto the image. The provided recolor calculator skips the LUT and just multiplies the luminance by the output color. So, if you can create a LUT, inserting it into the calculator would not be very difficult.
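
To make the two variants concrete (per-pixel pseudo-C++; illustrative only, not the actual RecolorCalculator code, and lut/recolor are assumed variables):

float lum = 0.299f * r + 0.587f * g + 0.114f * b;   // luminance of the hair pixel, 0-255
// Paper version: use the luminance as an index into the LUT.
cv::Vec3b new_color = lut[static_cast<int>(lum)];
// Provided RecolorCalculator: skip the LUT and scale the target color instead.
cv::Vec3b simplified_color = recolor * (lum / 255.0f);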

Nerdyvedi

comment created time in 3 months

issue comment google/mediapipe

Are ImageFrameToGpuBufferCalculator and GpuBufferToImageFrameCalculator only working on mobile? I get segmentation fault when trying to use it on desktop.

The alignment is automatic and not normally something to be concerned about. I just mention it as a caution because, depending on the sizes of your images, you can't assume the data is contiguous (on CPU), which is easy to overlook if you are having issues.

yxchng

comment created time in 3 months

fork mcclanahoochie/grafika

Grafika test app

fork in 3 months

issue comment google/mediapipe

Custom tensorflow model in mediapipe confusion/error

Hi, I would try to model how the YouTube-8M example uses TensorFlow calculators in the graph and dependencies in the BUILD files (binary, graphs), and see what dependencies or calculator options are missing (relevant to your project).

pablovela5620

comment created time in 3 months

issue comment google/mediapipe

Why the outputs of the resize function using ImageTransformationCalculator different for IMAGE (CPU) and IMAGE_GPU (GPU)?

In which way are the outputs different? Can you please show what you are seeing?

yxchng

comment created time in 3 months

issue comment google/mediapipe

coral face model with usb accelerator

The edge TPU models MediaPipe provides were compiled for version 12 of Mendel (the coral OS). Please flash the board to the latest version 13, or version 12.

natxopedreira

comment created time in 3 months

issue comment google/mediapipe

In android,save bitmap,not color

What does the rest of the graph look like? In the example Android graphs from MediaPipe, RGBA images are used throughout the pipeline. Have you also tried getBitmapFromRgba()?

teleger

comment created time in 3 months

issue comment google/mediapipe

Are ImageFrameToGpuBufferCalculator and GpuBufferToImageFrameCalculator only working on mobile? I get segmentation fault when trying to use it on desktop.

All GPU calculators, including the ImageFrame/GpuBuffer conversion calculators, can run on the GPU on Linux, if your GPU is supported.

yxchng

comment created time in 3 months

issue comment google/mediapipe

Wouldn't it be a better design if we expose the div and sub values in NormalizeImage function in tflite_converter_calculator.cc as options?

Hi, all floating point models MediaPipe uses expect values in either [-1, 1] or [0, 1]. If a quantized model is used, no normalization is done (the input is assumed to be 8-bit already). If your model expects floating point values in some other range [x, y], I would suggest writing a simple multiply calculator that follows the converter calculator. This request has been noted, but alternatively, adding 'div' and 'sub' options to LoadOptions() wouldn't be that hard for a user to do either. Thanks for the update.

yxchng

comment created time in 3 months
