awangk/ujson 3

µjson is a small, C++11, UTF-8, JSON library

awangk/awesome-cpp 0

A curated list of awesome C/C++ frameworks, libraries, resources, and shiny things. Inspired by awesome-... stuff.

awangk/folly 0

An open-source C++ library developed and used at Facebook.

awangk/glslang 0

Khronos reference front-end for GLSL and ESSL

awangk/ispc 0

Intel SPMD Program Compiler

awangk/microprofile 0

microprofile is an embeddable profiler

awangk/nativejson-benchmark 0

C/C++ JSON parser/generator benchmark

awangk/ninja 0

a small build system with a focus on speed

awangk/spdlog 0

Super fast C++ logging library.

awangk/stb 0

stb single-file public domain libraries for C/C++

issue closed halide/Halide

the option "-s" for gengen is no longer working

I found that, after upgrading Halide to 574b6bb7dd27c95641f72eb05a0dd187f46ee494, the option "-s" for gengen is no longer working:

bin/my_gen  -g test_schedule_gen -f test auto_schedule=true -e c_header,registration,schedule,c_source,assembly,stmt,llvm_assembly,bitcode,object target=x86-64-linux-avx2-no_runtime-disable_llvm_loop_opt-no_asserts-no_bounds_query machine_params=3,1000000,40 -p ../../Halide-build/apps/autoscheduler/libauto_schedule.so -s Adams2019 -o bin

It returns the following error:

: CommandLine Error: Option 'pm-max-devirt-iterations' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options

Things work fine when the specific autoscheduler's .so is passed to -p:

bin/my_gen  -g test_schedule_gen -f test auto_schedule=true -e c_header,registration,schedule,c_source,assembly,stmt,llvm_assembly,bitcode,object target=x86-64-linux-avx2-no_runtime-disable_llvm_loop_opt-no_asserts-no_bounds_query machine_params=3,1000000,40 -p  ../../bin/libautoschedule_adams2019.so -o bin

Perhaps, the help message for gengen should be fixed accordingly.

gengen 
  [-g GENERATOR_NAME] [-f FUNCTION_NAME] [-o OUTPUT_DIR] [-r RUNTIME_NAME] [-d 1|0]
  [-e EMIT_OPTIONS] [-n FILE_BASE_NAME] [-p PLUGIN_NAME] [-s AUTOSCHEDULER_NAME]
       target=target-string[,target-string...] [generator_arg=value [...]]

 -d  Build a module that is suitable for using for gradient descent calculationn
     in TensorFlow or PyTorch. See Generator::build_gradient_module() documentation.

 -e  A comma separated list of files to emit. Accepted values are:
     [assembly, bitcode, c_header, c_source, cpp_stub, featurization,
      llvm_assembly, object, python_extension, pytorch_wrapper, registration,
      schedule, static_library, stmt, stmt_html, compiler_log].
     If omitted, default value is [c_header, static_library, registration].

 -p  A comma-separated list of shared libraries that will be loaded before the
     generator is run. Useful for custom auto-schedulers. The generator must
     either be linked against a shared libHalide or compiled with -rdynamic
     so that references in the shared library to libHalide can resolve.
     (Note that this does not change the default autoscheduler; use the -s flag
     to set that value.)
 -r   The name of a standalone runtime to generate. Only honors EMIT_OPTIONS 'o'
     and 'static_library'. When multiple targets are specified, it picks a
     runtime that is compatible with all of the targets, or fails if it cannot
     find one. Flags across all of the targets that do not affect runtime code
     generation, such as `no_asserts` and `no_runtime`, are ignored.

 -s  The name of an autoscheduler to set as the default.

closed time in 7 minutes

benzwt

issue comment halide/Halide

Adjoints from Generators with Scalar Parameters

This was raised again in #5406.

clutzweiler

comment created time in 9 minutes

PR opened microsoft/DirectXShaderCompiler

Updates to RayTracing semantics
  1. Add OpTraceRayKHR
  2. Remove "provisional" in files
  3. Update the SPIRV Headers/Tools to latest version
+104 -84

0 comment

28 changed files

pr created time in 9 minutes

issue closed halide/Halide

Internal compiler error when differentiating generators with scalar parameters.

Make a generator with a scalar input (the example generator in test/generators works) and try to take the derivative

abadams@anadams-work:~/projects/Halide_autodiff_generator_scalar_bug
$ make bin/example.generator
make: 'bin/example.generator' is up to date.

abadams@anadams-work:~/projects/Halide_autodiff_generator_scalar_bug
$ bin/example.generator -g example -o . target=host -d 1
Internal Error at /home/abadams/projects/Halide_autodiff_generator_scalar_bug/src/Generator.cpp:1544
Condition failed: input->parameters_.size() == input->funcs_.size():
Aborted (core dumped)

I imagine the issue is that in the reverse pipeline that scalar input needs to turn into a scalar output, which could be done as a zero-dim buffer but isn't. We at least need an error message here, but ideally we'd turn them into zero-dim buffer outputs.

Not sure who did the autodiff support for generators. Maybe @BachiLi ?

closed time in 10 minutes

abadams

issue closed halide/Halide

OpenCL race condition with TailStrategy::ShiftInwards?

When generating tiled OpenCL code, Halide generates code which I believe has a race condition (multiple concurrent threads may write to the same output element). I do not see how this is legal according to the OpenCL specification and believe atomic stores should be used for correctness, or am I mistaken? The program does seem to consistently produce the correct output on my machine.

Small example:

output(x) = input(x) + 2 * input(x+1) + input(x+2);
output.gpu_tile(x, xi, 32); // defaults to TailStrategy::ShiftInwards

Intermediate statements extract:

  let t15 = (output.extent.0 + 31)/32
  let t16 = output.extent.0/32
  let t18 = (output.extent.0 + output.min.0) - input.min.0
  let t17 = output.min.0 - input.min.0
  gpu_block<OpenCL> (output.s0.x.x.__block_id_x, 0, t15) {
   gpu_thread<OpenCL> (.__thread_id_x, 0, 32) {
    if (output.s0.x.x.__block_id_x < t16) {
     let t13 = ((output.s0.x.x.__block_id_x*32) + t17) + .__thread_id_x
     output[(output.s0.x.x.__block_id_x*32) + .__thread_id_x] = input[t13 + 2] + (input[t13] + (input[t13 + 1]*2.000000f))
    } else {
     let t14 = .__thread_id_x + t18
     output[(.__thread_id_x + output.extent.0) + -32] = input[t14 + -30] + (input[t14 + -32] + (input[t14 + -31]*2.000000f))
    }
   }
  }

OpenCL kernel extract:

__kernel void kernel_output_s0_x_x___block_id_x(
 __address_space__input const float *restrict _input,
 __address_space__output float *restrict _output,
 // [...]
 __local int16* __shared)
{
 int _output_s0_x_x___block_id_x = get_group_id(0);
 int ___thread_id_x = get_local_id(0);
 bool _0 = _output_s0_x_x___block_id_x < _t16;
 if (_0)
 {
  int _1 = _output_s0_x_x___block_id_x * 32;
  int _13 = _1 + ___thread_id_x;
  // [...]
  _output[_13] = _12;
 } // if _0
 else
 {
  // [...]
  int _25 = ___thread_id_x + _output_extent_0;
  int _26 = _25 + -32;
  _output[_26] = _24;
 } // if _0 else
} // kernel kernel_output_s0_x_x___block_id_x
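
The index arithmetic in the two extracts above can be checked on the CPU. The following is a minimal sketch in plain C++, not Halide or OpenCL code: `count_stores` and `count_overlapping` are hypothetical helpers that replay the block and thread loops and count how many stores land on each output element. It only shows which elements are stored twice; each of those stores writes a value that depends only on the element index. Whether that is permitted by the OpenCL specification is the question raised in the issue, and the sketch does not settle it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Count how many times each output element is stored when `extent` elements
// are covered by blocks of `tile` threads and the last partial block is
// shifted inwards, following the index expressions in the IR above.
std::vector<int> count_stores(int extent, int tile) {
    assert(tile > 0 && extent >= tile);  // assumes at least one full tile
    const int num_blocks = (extent + tile - 1) / tile;  // t15 in the IR
    const int full_blocks = extent / tile;              // t16 in the IR
    std::vector<int> stores(static_cast<std::size_t>(extent), 0);
    for (int block = 0; block < num_blocks; ++block) {
        for (int thread = 0; thread < tile; ++thread) {
            // Full blocks use block*tile + thread; the tail block is shifted
            // so that it ends exactly at the last element.
            const int index = (block < full_blocks)
                                  ? block * tile + thread
                                  : thread + extent - tile;
            ++stores[static_cast<std::size_t>(index)];
        }
    }
    return stores;
}

// Number of elements that receive more than one store.
int count_overlapping(const std::vector<int>& stores) {
    int overlapping = 0;
    for (int s : stores) {
        if (s > 1) ++overlapping;
    }
    return overlapping;
}
```

For an extent of 70 and a tile size of 32, the last full block covers indices 32 to 63 and the shifted tail block covers 38 to 69, so indices 38 to 63 (26 elements) are stored twice. When the extent is a multiple of the tile size, no element is stored more than once.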

closed time in 11 minutes

Bastacyclop

issue closed halide/Halide

API `device_wrap_native` is not usable for OpenCL JIT target.

Halide's Buffer has the method device_wrap_native. In the case of OpenCL, it is supposed to be used for wrapping an externally created cl_mem object. But it is not possible to create a cl_mem object in the JIT case, because the cl_context is not available and halide_acquire_cl_context can't be overridden. So the question is: is it possible to retrieve the cl_context (and cl_command_queue) from Halide JIT, or to provide them to it?

Thanks!

closed time in 12 minutes

magpier84

issue closed halide/Halide

Runtime autoschedule

I would like to run the autoscheduler at runtime for a Func that is not known at compile time.

For that I designed the following class:

class AutoScheduledDynamic : public Generator<AutoScheduledDynamic>
{
public:
	static GeneratorInput<Buffer<f32>>	oInput;
	static GeneratorOutput<Buffer<f32>>	oOutput;

	void generate()
	{
		oStart();
		Func oPassthrough{ "Passthrough" };
		oPassthrough( x, y, c ) = oInput( x, y, c );
		Func oOperation = oFunc( oPassthrough );
		oOutput( x, y, c ) = oOperation( x, y, c );
		oEnd();
	}

	void schedule()
	{
		if (auto_schedule)
		{
			oInput.set_estimates(oRegInput);
			oOutput.set_estimates(oRegOutput);
		}
	}

	static Region oRegInput;
	static Region oRegOutput;

	static std::function<Func(Func)> oFunc;
	static std::function<void()> oStart;
	static std::function<void()> oEnd;

	static Var x, y, c;
};

Here oFunc is defined in the application code. We can imagine a stack of image-processing operations assembled by a user behind a UI, such as Blur > Resize > ... > Sharpen, and we want to autoschedule the whole stack to get an overall optimization.

At runtime we do:

AutoScheduledDynamic::oRegInput = { {0, iWidth}, {0, iHeight}, {0, iChannel} };
AutoScheduledDynamic::oRegOutput = { {0, iWidth}, {0, iHeight}, {0, iChannel} };

auto ctx = Halide::GeneratorContext{ target, true, Halide::MachineParams{ GetCore(), 16777216, GetMemoryScale() } };


AutoScheduledDynamic::oStart = [iChannel]() {
	AutoScheduledDynamic::oInput.dim(0)
		.set_stride(iChannel)
		.dim(2)
		.set_stride(1)
		.set_extent(iChannel);
};
AutoScheduledDynamic::oFunc = &HERE_THE_OPERATIONS_STACKED;
AutoScheduledDynamic::oEnd = [iChannel]() {
	AutoScheduledDynamic::oOperation.reorder(
						AutoScheduledDynamic::c,
						AutoScheduledDynamic::x,
						AutoScheduledDynamic::y);
	AutoScheduledDynamic::oOperation.output_buffer()
		.dim(0)
		.set_stride(iChannel)
		.dim(2)
		.set_stride(1)
		.set_extent(iChannel);
};

To replace the macro:

auto factory = [](const Halide::GeneratorContext& context) -> std::unique_ptr<Halide::Internal::GeneratorBase> {
	return AutoScheduledDynamic::create(context, "gen_name", "gen_name");
};
Halide::Internal::RegisterGenerator reg_gen_name = Halide::Internal::RegisterGenerator("gen_name", factory);

As a toy test I used: Halide::Internal::generate_filter_main(18, const_cast<char**>(&values[0]), std::cout);

Nevertheless I get: Unhandled exception: Error: Must use Output<> with generate() method.

I tried adding a dummy non-static GeneratorOutput<>, which fixes that. I checked the code but wasn't able to find a solution. Does anyone have comments on my approach, or a solution?

closed time in 12 minutes

chkone

issue closed halide/Halide

Implement the ping-pong buffer in Halide

Hi all,

We are using Halide to implement some CV algorithms on Huawei DaVinci architecture chips, and we have run into trouble implementing a ping-pong buffer.

The DaVinci architecture has several pipelines, such as MTE2 (moves data from global memory to on-chip memory), VECTOR (performs SIMD computation), and MTE3 (moves data from on-chip memory to global memory). The pipelines need to be synchronized. One way to improve an algorithm's performance is to use a ping-pong buffer, which allows the different pipelines to run in parallel.

We have implemented the single-buffer version in Halide. The IR looks like this:

for (i0, 0, i0.extent) {
     copy_data_in(addr1_onchip,  addr1_gm);
     vector_xxx(addr1_onchip);
     ...;
     copy_data_out(addr2_gm, addr2_onchip);
}

We want the ping-pong buffer IR to look like this:

for (i0, 0, i0.extent / 2) {
     // ping buffer
     copy_data_in(addr1_1_onchip,  addr1_gm);
     vector_xxx(addr1_1_onchip);
     ...;
     copy_data_out(addr2_gm, addr2_1_onchip);

     // pong buffer
     copy_data_in(addr1_2_onchip,  addr1_gm + offset);
     vector_xxx(addr1_2_onchip);
     ...;
     copy_data_out(addr2_gm + offset, addr2_2_onchip);
}

The pipeline changes like this:

[image]

I can't get this IR from the schedule. Do you have any ideas?
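
The buffer alternation in the desired IR can be sketched in plain C++. This is a sequential simulation under stated assumptions, not Halide IR and not DaVinci code: `vector_op` is a hypothetical stand-in for the VECTOR stage, `run_ping_pong` is an illustrative name, and the copies are ordinary memory copies. On real hardware the benefit comes from overlapping the copy into one buffer with the compute on the other, which this sketch does not model; it only shows how consecutive tiles alternate between two on-chip buffers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the VECTOR pipeline stage: doubles each element.
void vector_op(std::vector<float>& onchip, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        onchip[i] *= 2.0f;
    }
}

// Process gm_in tile by tile, alternating between two "on-chip" buffers.
std::vector<float> run_ping_pong(const std::vector<float>& gm_in, std::size_t tile) {
    assert(tile > 0);
    std::vector<float> gm_out(gm_in.size());
    std::vector<float> onchip[2] = {std::vector<float>(tile), std::vector<float>(tile)};
    std::size_t tile_index = 0;
    for (std::size_t offset = 0; offset < gm_in.size(); offset += tile, ++tile_index) {
        const std::size_t count = std::min(tile, gm_in.size() - offset);
        const std::ptrdiff_t off = static_cast<std::ptrdiff_t>(offset);
        // Even tiles use the ping buffer, odd tiles use the pong buffer.
        std::vector<float>& buf = onchip[tile_index % 2];
        std::copy_n(gm_in.begin() + off, count, buf.begin());   // copy_data_in
        vector_op(buf, count);                                  // vector_xxx
        std::copy_n(buf.begin(), count, gm_out.begin() + off);  // copy_data_out
    }
    return gm_out;
}
```

Because each tile touches only its own buffer, the copy-in for tile i+1 would not conflict with the compute on tile i, which is the property the ping-pong IR above relies on.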

closed time in 13 minutes

Vernlium

issue closed halide/Halide

OpenCL backend error

I'm using Halide prebuilt binaries and trying to use a POCL device for OpenCL backend. I am passing the env variables as below and invoking share/Halide/tutorial/lesson_12_using_the_gpu

HL_OCL_PLATFORM_NAME="Portable Computing Language"
HL_OCL_DEVICE_TYPE="acc"

I get the below error message.

Running pipeline on GPU:
JIT compiling opencl for x86-64-linux-avx-avx2-f16c-fma-jit-opencl-sse41
: CommandLine Error: Option 'help-list' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options

closed time in 14 minutes

harishch4

issue closed halide/Halide

Crash in park() method

Hello there,

We're experiencing a crash in our solution, and it seems to be happening somewhere inside Halide::Runtime::Internal::Synchronization::park() method.

Can anybody help me out? Perhaps on how to further investigate in order to get to the bottom of it. Or maybe tell me if this is some known issue in Halide library.

My tombstone info looks like this:

backtrace:
      #00 pc 000000000004b38c  /apex/com.android.runtime/lib64/bionic/libc.so (syscall+28) (BuildId: 41c660c694a41af9265f00d2b0edc3)
      #01 pc 000000000004ef74  /apex/com.android.runtime/lib64/bionic/libc.so (__futex_wait_ex(void volatile*, bool, int, bool, timespec const*)+144) (BuildId: 41c660c694a41af9265f00d2b0edc3)
      #02 pc 00000000000af0d0  /apex/com.android.runtime/lib64/bionic/libc.so (pthread_cond_wait+60) (BuildId: 41c660c694a41af9265f00d2b0edc3)
      #03 pc 000000000004aae8  /data/app/~~gmtCkI2Fbb1DIAQoE_udGQ==/our.package.name-2PJWDj28x17PgxCzNB_CLQ==/split_vendor_nio.apk!libOurLibrary.so (offset 0x1000) (Halide::Runtime::Internal::Synchronization::park(unsigned long long, Halide::Runtime::Internal::Synchronization::parking_control&)+244) (BuildId: 50ba462298266f6cfb907af0eb91d66abfb208df)

Thank you in advance.

closed time in 14 minutes

alexandrebodi

issue closed halide/Halide

Back-propagation is too slow with propagate_adjoints

Hi guys, I'm new to Halide and am having trouble understanding why back-propagation is so slow compared with the forward run.

I'm trying to write an operation for PyTorch. Here is my code: https://github.com/creotiv/halide_bilateral_slice_apply

I would very much appreciate it if someone could tell me where the problem is. Thanks

closed time in 17 minutes

creotiv

delete branch halide/Halide

delete branch : fix/5471

delete time in 19 minutes

push event halide/Halide

Alex Reinking

commit sha 87c9facbeee7248740ad6bd43eb5a70142049e10

Fail CMake when LLVM_LINK_LLVM_DYLIB conflicts with wasm (#5472)

* Fail CMake when LLVM_LINK_LLVM_DYLIB conflicts with wasm
* Update error message and add comment.

push time in 19 minutes

PR merged halide/Halide

Fail CMake when LLVM_LINK_LLVM_DYLIB conflicts with wasm

attn @JanHett -- you might want to update your submodule when this is merged.

Fixes #5471

+11 -0

0 comment

1 changed file

alexreinking

pr closed time in 19 minutes

issue closed halide/Halide

LLVM_LINK_LLVM_DYLIB and WebAssembly don't mix

LLVM's build has an option, LLVM_LINK_LLVM_DYLIB, which links "tools" to the shared LLVM library. This is ludicrous and no one should use it because it will also link the static lldWasm library to the shared LLVM library. Yet, the Linux version of Homebrew does use this.

Fortunately, the installed CMake files for LLVM tell you if this is the case: they will set LLVM_LINK_LLVM_DYLIB to something true-y (ON on my system). So the following three things cannot be true at the same time:

  1. TARGET_WEBASSEMBLY is enabled
  2. LLVM_LINK_LLVM_DYLIB is enabled
  3. Halide_SHARED_LLVM is disabled

We should check for this and error out if it happens.
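
As a rough illustration, the forbidden combination can be written as a single predicate. The sketch below is plain C++ rather than CMake, and the function and parameter names are illustrative only; they mirror the three settings listed above and are not read from a real build.

```cpp
#include <cassert>

// True only for the combination the issue says must be rejected:
// WebAssembly enabled, LLVM built with LLVM_LINK_LLVM_DYLIB, and Halide
// not linking the shared LLVM. Parameter names are illustrative.
constexpr bool is_conflicting(bool target_webassembly,
                              bool llvm_link_llvm_dylib,
                              bool halide_shared_llvm) {
    return target_webassembly && llvm_link_llvm_dylib && !halide_shared_llvm;
}

// Count how many of the eight possible configurations are rejected.
int count_rejected_configurations() {
    int rejected = 0;
    for (int bits = 0; bits < 8; ++bits) {
        if (is_conflicting((bits & 1) != 0, (bits & 2) != 0, (bits & 4) != 0)) {
            ++rejected;
        }
    }
    return rejected;
}
```

Only one of the eight configurations is rejected, so a check of this shape leaves every other combination untouched.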

closed time in 19 minutes

alexreinking

Pull request review comment halide/Halide

Fail CMake when LLVM_LINK_LLVM_DYLIB conflicts with wasm

 endforeach ()
 set(wasm_libs "")
 if (TARGET_WEBASSEMBLY)
     find_package(LLD CONFIG REQUIRED HINTS "${LLVM_DIR}/../lld")
+
+    if (LLVM_LINK_LLVM_DYLIB AND NOT Halide_SHARED_LLVM)
+        message(FATAL_ERROR "LLVM was built with LLVM_LINK_LLVM_DYLIB. "

Done; merging.

alexreinking

comment created time in 19 minutes

push event halide/Halide

Alex Reinking

commit sha 07e5aef7261a357ac9bd15bd87384ba56d00cadf

Update error message and add comment.

push time in 20 minutes

Pull request review comment halide/Halide

Fail CMake when LLVM_LINK_LLVM_DYLIB conflicts with wasm

 endforeach ()
 set(wasm_libs "")
 if (TARGET_WEBASSEMBLY)
     find_package(LLD CONFIG REQUIRED HINTS "${LLVM_DIR}/../lld")
+
+    if (LLVM_LINK_LLVM_DYLIB AND NOT Halide_SHARED_LLVM)
+        message(FATAL_ERROR "LLVM was built with LLVM_LINK_LLVM_DYLIB. "

Will do

alexreinking

comment created time in an hour

Pull request review comment halide/Halide

Fail CMake when LLVM_LINK_LLVM_DYLIB conflicts with wasm

 endforeach ()
 set(wasm_libs "")
 if (TARGET_WEBASSEMBLY)
     find_package(LLD CONFIG REQUIRED HINTS "${LLVM_DIR}/../lld")
+
+    if (LLVM_LINK_LLVM_DYLIB AND NOT Halide_SHARED_LLVM)
+        message(FATAL_ERROR "LLVM was built with LLVM_LINK_LLVM_DYLIB. "

It might be useful to reference this PR (and/or issue) in a comment here.

alexreinking

comment created time in an hour

delete branch halide/Halide

delete branch : pr/5455

delete time in an hour

issue comment libuv/libuv

Is reading from a pipe fd supported in file mode on Windows

Should an ERROR_BROKEN_PIPE be considered a valid EOF on Windows?

According to ReadFile docs, yes:

If an anonymous pipe is being used and the write handle has been closed, when ReadFile attempts to read using the pipe's corresponding read handle, the function returns FALSE and GetLastError returns ERROR_BROKEN_PIPE.
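
A minimal, portable sketch of the classification this implies is shown below. It is not libuv's code: `ReadOutcome` and `classify_read` are hypothetical names, and the error code is repeated as a constant (109, the value of ERROR_BROKEN_PIPE in the Windows SDK headers) so that the sketch builds without those headers. It assumes the caller passes the BOOL result of ReadFile and, on failure, the value from GetLastError.

```cpp
#include <cassert>
#include <cstdint>

// ERROR_BROKEN_PIPE as defined in the Windows SDK headers, repeated here so
// the sketch compiles on any platform.
constexpr std::uint32_t kErrorBrokenPipe = 109;

enum class ReadOutcome { Data, Eof, Error };

// Classify the result of a ReadFile-style call on the read end of a pipe.
// `succeeded` is the BOOL return value; `last_error` is the GetLastError()
// value and is only consulted when the call failed.
ReadOutcome classify_read(bool succeeded, std::uint32_t last_error) {
    if (succeeded) {
        return ReadOutcome::Data;
    }
    // Per the documentation quoted above, a closed write handle surfaces as
    // ERROR_BROKEN_PIPE, which is treated here as end of file.
    if (last_error == kErrorBrokenPipe) {
        return ReadOutcome::Eof;
    }
    return ReadOutcome::Error;
}
```

Any other failure code is still reported as an error, so only the closed-writer case is folded into end of file.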

mmomtchev

comment created time in an hour

issue comment gabime/spdlog

Test failure on OSX (async periodic flush)

Thanks for reporting. It seems that increasing the sleep time in line 90 might fix this test. Please confirm whether it helps.

bluescarni

comment created time in 2 hours

issue closed gabime/spdlog

Not able to integrate SPDLOG to C++/CLI Application

When I try to include the spdlog header files in my application, I get errors related to atomic operations (mutex, condition variable, etc.).

How do I integrate it into my C++/CLI project? Are there any wrappers available to support this?

closed time in 2 hours

sk-saravana

issue closed gabime/spdlog

Used to have console color output, but not anymore.

Hi, I upgraded from spdlog 0.16.4-rc to 1.8.1 (checked out from master on 2020.11.18: #eebb921). In the previous version, I was getting colored console output via 'spdlog::sinks::wincolor_stdout_sink_mt'. After the upgrade, the console output is black and white.

I was using the previous version as header-only. I'm using the new version as a shared library. I include the header like this:

#define SPDLOG_COMPILED_LIB 
#include "spdlog/spdlog.h"
...

This is all in Windows 10, 64bit. MSVC 2019, v16.8

Any ideas what I can try so that I can get color output back?

closed time in 2 hours

gamagan

issue closed gabime/spdlog

-Wundefined-func-template warning on Clang10 when using sink->set_pattern()

spdlog version: 1.8.1 compiled as library
compiler version: Clang 10.0.1 on Linux
compiler flags: -Wundefined-func-template

Minimal working example:

#include <memory>

#include "spdlog/sinks/stdout_color_sinks.h"
#include "spdlog/spdlog.h"

int main(int argc, char **argv) {
  auto console_sink = std::make_shared<spdlog::sinks::stdout_color_sink_mt>();
  console_sink->set_pattern("%^%l%$ - %v");
}

Compiler diagnostic:

/home/pha/tmp/spdlog-error/spdlog-error.cpp:8:17: warning: instantiation of function 'spdlog::sinks::ansicolor_sink<spdlog::details::console_mutex>::set_pattern' required here, but no definition is available [-Wundefined-func-template]
  console_sink->set_pattern("%^%l%$ - %v");
                ^
/home/pha/tmp/spdlog-error/spdlog/include/spdlog/sinks/ansicolor_sink.h:44:10: note: forward declaration of template entity is here
    void set_pattern(const std::string &pattern) final;
         ^
/home/pha/tmp/spdlog-error/spdlog-error.cpp:8:17: note: add an explicit instantiation declaration to suppress this warning if 'spdlog::sinks::ansicolor_sink<spdlog::details::console_mutex>::set_pattern' is explicitly instantiated in another translation unit
  console_sink->set_pattern("%^%l%$ - %v");
                ^
1 warning generated.
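
The last note in the diagnostic refers to an explicit instantiation declaration (`extern template`). The sketch below shows the mechanism in a single file under stated assumptions: `demo_sink` and `demo_mutex` are hypothetical stand-ins, not spdlog types, and the member definition that would normally live in a separate source file is placed at the bottom of the same file so the example links on its own. Whether spdlog itself should carry such declarations is a separate question that the sketch does not answer.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for a mutex policy and a sink class template whose
// member function is declared in a header but defined elsewhere.
struct demo_mutex {};

template <typename Mutex>
class demo_sink {
public:
    void set_pattern(const std::string& pattern);  // declared only
    const std::string& pattern() const { return pattern_; }

private:
    std::string pattern_;
};

// Explicit instantiation declaration: states that demo_sink<demo_mutex> is
// instantiated in some translation unit, so the call below does not need
// the member definition to be visible.
extern template class demo_sink<demo_mutex>;

// A use of the member before its definition has been seen.
std::string configured_pattern() {
    demo_sink<demo_mutex> sink;
    sink.set_pattern("%^%l%$ - %v");
    assert(!sink.pattern().empty());
    return sink.pattern();
}

// Normally in a separate .cpp file: the member definition followed by the
// explicit instantiation definition that the declaration above refers to.
template <typename Mutex>
void demo_sink<Mutex>::set_pattern(const std::string& pattern) {
    pattern_ = pattern;
}

template class demo_sink<demo_mutex>;
```

The explicit instantiation definition must come after the declaration when both appear in one translation unit, which is why it sits at the end of the file.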

closed time in 2 hours

Hallot

issue comment gabime/spdlog

-Wundefined-func-template warning on Clang10 when using sink->set_pattern()

This seems like an invalid/pointless warning. Forward declaring all template instantiations is simply not required in C++ and is a lot of work. The linker will complain anyway if it can't find a definition.

Hallot

comment created time in 2 hours

push event AcademySoftwareFoundation/OpenColorIO

Michael Dolan

commit sha 3d3eefca32b334331498fd50e2b25d89b374de7f

Add TSC meeting notes for 11-09-2020 (#1200)

* Add TSC meeting notes for 11-09-2020
  Signed-off-by: Michael Dolan <michdolan@gmail.com>
* Clean up notes
  Signed-off-by: Michael Dolan <michdolan@gmail.com>

push time in 2 hours

issue comment googlefonts/noto-fonts

Missing Old Hungarian diacritics

@dscorbett David, do you understand why I want this issue to be closed? I met a young person who used an acute accent above the Old Hungarian u in his own diary. I asked him why he used this form. He answered that he knew the real form of the Old Hungarian ú, but had decided to use this form anyway. He was addicted to drugs. Several years later I met him again. He was already using the correct Unicode forms of the letters. He was clearer by then, but he still believed other things from the net that aren't exactly true.

dscorbett

comment created time in 2 hours

PR closed halide/Halide

Clone #5455 for buildbots
+23 -253

0 comment

2 changed files

abadams

pr closed time in 2 hours
